Field of the invention

The present invention concerns a novel enzyme having endoglucanase activity. The enzyme
is isolated from the fungus Trichoderma reesei. The invention also relates to an isolated and
purified DNA sequence coding for the novel enzyme as well as vectors, yeast strains and
fungal strains containing the DNA sequence. Furthermore, the invention concerns a method
for isolating the DNA sequence coding for the novel enzyme and for constructing fungal
strains which are capable of expressing endoglucanase. The invention also provides an
enzyme product having endoglucanase activity and methods for enzymatically modifying
cellulosic/lignocellulosic materials, in particular for modification or degradation of cellulose
and/or β-glucan.



Background of the invention






Many fungal species produce enzymes that degrade plant polymers into simple compounds
like sugars. The fungus Trichoderma reesei is one of the most potent and most studied
organisms degrading cellulose. It produces all the enzyme types needed for efficient
breakdown of crystalline cellulose, namely endo-1,4-β-D-glucanases (EC 3.2.1.4),
cellobiohydrolases (exo-1,4-β-D-glucanases, EC 3.2.1.91) and 1,4-β-D-glucosidases
(EC 3.2.1.21). The number of enzymes belonging to each class is far from clear, but the
existence of at least two cellobiohydrolases, CBHI and CBHII, and two endoglucanases, EGI
and EGII (formerly EGIII), has been confirmed by cloning of the corresponding genes
(Shoemaker et al. 1983, Teeri et al. 1983, Penttilä et al. 1986, Chen et al. 1987, Teeri et al.
1987, van Arsdell et al. 1987, Saloheimo et al. 1988).

It is known in the art that the different types of cellulolytic enzymes mentioned above attack
different parts of the cellulose molecule, and that cellulose hydrolysis by a cellulase
mixture is the result of synergy between its components. Therefore, if total hydrolysis of a
cellulose substrate is aimed at, it is generally required that the cellulase mixture contain
β-glucosidases, cellobiohydrolases as well as endoglucanases. As mentioned above,
Trichoderma reesei produces such enzyme mixtures.



PCT/FI94/00234



It is also known that the cellulase enzymes belonging to the same class are mutually different
as regards their activities towards some of the cellulosic substrates. For example, CBHII will
catalyze hydrolysis of β-glucan whereas CBHI is inactive toward that substrate.

The cellulase enzymes usually consist functionally of two different parts, viz. a core and a
tail, which are interconnected by an intermittent part (known as the linker). The active centre
of the enzyme is located in the core. The function of the tail consists mainly of its capability
to attach the enzyme to an insoluble substrate. Thus, if the tail is removed the activity of the
enzyme toward macromolecular and crystalline substrates can be substantially decreased.

By way of a general definition, the name "endoglucanases" is assigned to enzymes that
catalyze random hydrolysis of β-1,4-glycosidic bonds between glucose units of cellulose
polymers. The two major Trichoderma endoglucanases, EGI and EGII, contain about 500 to
600 amino acids and their molecular weights are about 50 to 60 kDa. The cellobiohydrolases
are similar in size. Such rather bulky molecules may have difficulties in
penetrating some fibrous substrates whose adjacent polysaccharide chains are aligned and
located close to each other. Such substrates are represented by fibrous materials of great
economic value, such as cellulose pulp. Therefore, endoglucanases of a low molecular
weight have been of increasing interest in recent years.

Håkansson et al. (1978) have purified a small endoglucanase from culture filtrates of
T. reesei. This enzyme has a size of about 20 kDa, a neutral pI and, unlike the major
cellulases, it does not contain carbohydrate moieties. Håkansson et al. found the enzyme to
be present in the culture medium in very small amounts. Small endoglucanases of similar
properties have also been isolated by Gong et al. (1979) and Ülker and Sprey (1990).
However, although the molecular weight of the endoglucanase isolated by Håkansson and
partially sequenced by Ståhlberg in 1991 is rather low, the molecular configuration of the
enzyme is not advantageous as far as enzymatic applications are concerned. The molecule
appears not to contain a linker domain and a cellulose binding domain (CBD) but only a core
domain. The cellulose binding domain and a linker region, allowing for its flexible separation
from the catalytic core, are essential features of true cellulases capable of efficient attachment
to the substrate.

The PCT Application No. PCT/US91/07276 discloses an endoglucanase enzyme, called
EGIII, derived from Trichoderma. The molecular size of EGIII is 23 to 28 kDa, its pH
optimum is 5.5 to 6.0 and the pI 7.2 to 8.0. From the sequence data of EGIII, it is apparent
that said enzyme is the same as the one isolated by Håkansson and sequenced by Ståhlberg
and that it does not contain the linker and CBD domains.






Known in the art are also small endoglucanase enzymes isolated from other microorganisms.
Thus, a gene coding for a polypeptide homologous to the short amino acid sequence available
from the protein described above has been isolated from the fungus Aspergillus aculeatus
(Ooi et al. 1990). The PCT Patent Application No. PCT/DK91/00123 describes an
endoglucanase derived from the fungus Humicola insolens. The size of the polypeptide
molecule is 43 kDa and its isoelectric point is 5.1. The use of the enzyme for treatment of



suggested. 



Nothing has so far been reported on the existence of a small-size, true Trichoderma
endoglucanase having cellulose binding regions. It has been a general conception that the
cellulase system of Trichoderma consists of at least two CBHs and two EGs and, in addition,
the EGIII which lacks a CBD.

Isolation and manipulation of the cellulase genes is very important for the various
commercial uses of enzymes and of the organisms producing them. Isolation of hydrolase genes
from eukaryotes has been a task demanding either extensive studies on the corresponding
enzymes or laborious differential hybridization protocols.

Summary of the invention



It is an object of the present invention to provide a novel endoglucanase enzyme of low
molecular weight and having a suitable configuration for enzymatic applications.

This invention provides an endoglucanase enzyme derived from Trichoderma reesei which (in
unglycosylated form) has a molecular weight of about 20 to 25 kDa and contains 242 amino
acids (the mature protein contains fewer amino acids than that, depending on the signal
sequence cleavage site), some 70 % of which are located in the core region, whereas roughly
one sixth of the amino acids is in the linker, taking an extended conformation, and one sixth
in the CBD domain. This distribution of the amino acid residues within the molecule gives
evidence of it having an elongated, "wormish" form in comparison to other cellulases, which
facilitates penetration between adjacent molecules of fibrous cellulosic substrates. Being
different in structure and activity, the enzyme complements the cellulolytic enzyme mixture,
acting in synergy, as the Examples below will show.
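
The residue distribution stated above can be checked with simple arithmetic. The following Python sketch is only an illustration: the function name `domain_residue_estimates` and the rounding are assumptions, and the shares are assumed to refer to all 242 residues. Because the text gives approximate shares ("some 70 %", "roughly one sixth"), the estimates need not add up to exactly 242.

```python
from fractions import Fraction

TOTAL_RESIDUES = 242  # full-length protein, signal sequence included (from the text)

# Approximate shares quoted in the text; applying them to all 242 residues
# is an assumption made for this illustration.
SHARES = {
    "core": Fraction(70, 100),
    "linker": Fraction(1, 6),
    "CBD": Fraction(1, 6),
}


def domain_residue_estimates(total=TOTAL_RESIDUES, shares=SHARES):
    """Return the rounded residue count implied by each approximate share."""
    return {name: round(total * share) for name, share in shares.items()}


if __name__ == "__main__":
    for name, count in domain_residue_estimates().items():
        print(f"{name}: about {count} residues")
```

On these assumptions the linker and the CBD each account for about 40 residues, which is in line with the description of a small core carrying a comparatively long, extended linker.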

Another object of the invention is to provide a simple and rapid method for isolation of
endoglucanase genes by function. In fact, the method described in more detail below makes
it possible to isolate any hydrolytic enzyme gene, such as genes coding for cellulases (for
instance endoglucanases and cellobiohydrolases) and hemicellulases (for instance xylanases
and mannanases), without previous knowledge of the corresponding proteins. In this
connection it should be pointed out that before this invention there did not exist any data on
protein level which would have suggested the existence of the novel endoglucanase described
herein. This fact is already indicative of the unusual properties resulting in its disregard in
the biochemical characterization of the cellulase mixture produced by Trichoderma.

According to the present method, an expression cDNA library is made from the organism of
choice into a yeast expression vector. Yeast transformants are screened on plates containing
the substrate of the desired activity. Using our earlier finding (Penttilä et al. 1987, 1988) that
yeast produces and secretes the major cellulases of T. reesei in active form, the enzymatic
activities can be visualized on substrate plates.

In the following description of the present invention, the novel gene coding for the novel
endoglucanase enzyme is characterized, as is its transfer into, and the expression thereof in,
suitable hosts, such as fungi of the genus Trichoderma, in particular various Trichoderma
reesei strains, and yeasts, such as Saccharomyces cerevisiae.

The invention also provides yeast and fungal strains transformed with the gene coding for the
novel endoglucanase enzyme. Finally, applications of the enzyme are suggested.






Brief description of the drawings



Fig. 1 shows the nucleotide sequence of the gene egl5 coding for the novel enzyme, EGV,
Fig. 2A shows the cellulose binding domains and Fig. 2B linker regions of EGV compared
with the same domains and regions of the other Trichoderma cellulases. In Fig. 2B, the serine
and threonine residues have been boxed,
Fig. 3 shows the endoglucanase gene egl5 integrated into plasmid pAJ401 resulting in
plasmid pAS4,
Fig. 4 shows the endoglucanase gene egl5 integrated into plasmid pML016del5 resulting in
plasmid pAS16,
Fig. 5 shows the structure of plasmid pML016,
Fig. 6 shows the structure of plasmid pML016del5(11),
Figs. 7a to 7d depict the construction of the egl5 expression plasmid pALK956, Fig. 7a also
indicating the structure of plasmid pAS13,
Fig. 8 indicates the relative activity of the novel endoglucanase enzyme as a function of the
pH,
Fig. 9 shows the pH stability of the enzyme, and
Fig. 10 shows the introns and coding sequence of the egl5 gene.

In the following description, the following abbreviations and definitions are used:

Abbreviations:

aa, amino acid(s); bp, base pair(s); CBD, cellulose-binding domain; CBH, cellobiohydrolase;
cbh, gene coding for CBH; CMC, carboxymethyl cellulose; EG, endoglucanase; egl, gene
coding for EG; HCA, hydrophobic cluster analysis; HEC, hydroxyethyl cellulose; kb,
kilobase(s); kDa, kilodalton(s); MUC, 4-methylumbelliferyl β-D-cellobioside; MUL,
4-methylumbelliferyl β-D-lactoside; NMR, nuclear magnetic resonance; PCR, polymerase chain
reaction; PGK, 3-phosphoglycerate kinase gene of Saccharomyces cerevisiae; pI, isoelectric
point.

Definitions:

Within the scope of the present invention, the term "cellulase" is used as a collective term
which encompasses enzymes catalyzing reactions which participate in the degradation of
insoluble cellulose or cellulosic substrates to soluble carbohydrate. "Cellulase" is known in
the art to refer to such a group of enzymes. As mentioned above, for hydrolysis of cellulose
to glucose, three cellulase enzymes (three types of cellulase enzyme activity) are needed:
randomly cleaving endoglucanases (1,4-β-D-glucan glucanohydrolase, EC 3.2.1.4) which
usually attack substituted soluble substrates; cellobiohydrolase (1,4-β-D-glucan
cellobiohydrolase, EC 3.2.1.91) which is capable of degrading crystalline cellulose but has no
activity towards derivatized cellulose; and β-glucosidase (β-D-glucoside glycohydrolase,
EC 3.2.1.21) which degrades cellobiose and cello-oligosaccharides to yield glucose. Each of
the three main types of enzymes listed above occurs in multiple forms. For example, two
immunologically distinctive cellobiohydrolases, CBHI and CBHII, are known. In addition, at
least two distinct endoglucanases are known. Synergistic action between some of these
enzymes has been demonstrated. "Cellulase activity" is synonymous with cellulolytic activity.

Enzymes having "endoglucanase activity" are, within the scope of the present invention,
enzymes which will catalyse the hydrolysis of internal β-1,4-linkages of cellulose.



By "enzyme preparation" is meant a composition containing enzymes which have been
extracted from (either partially or completely purified from) the microorganisms (for instance
the fungi) producing them. The term "enzyme preparation" is meant to include a composition
comprising medium used to culture such microorganisms and any enzymes which the
microorganisms have secreted into such medium during the culture.

"Culture medium" denotes a medium previously used to culture a fungus ("spent" culture
medium), such culture medium containing enzymes which the fungi have secreted into the
medium during the culture. The culture medium can be used as such or as partially or
completely purified, concentrated, dried or immobilized.

By "hybridization" are meant conditions under which all the different forms of DNA
sequences hybridize to the DNA sequence encoding the Trichoderma enzyme having
endoglucanase activity, the molecular weight of the unglycosylated form of said enzyme
being about 20 to 25 kDa and containing 242 amino acids (the mature protein having fewer
amino acids).

"Gene" denotes a DNA sequence containing a template for an RNA polymerase. RNA that
codes for a protein is termed messenger RNA (mRNA).

It is well known that mutations occur in genes and can cause changes in the amino acid
sequence of the encoded polypeptide. Changes can also be introduced by genetic engineering
techniques. As used herein, the term egl5 gene includes all DNA sequences homologous with
the sequence herein disclosed for egl5 and encoding polypeptides with the functional or
structural properties of the about 20 to 25 kDa polypeptide. It is known in the art that
cellulases lacking the linker and CBD regions still exhibit catalytic activity towards the
β-1,4-glucosidic linkage, and thus a smaller core polypeptide is also included in the denotation
of egl5. Sequences artificially derived from this gene but still encoding a polypeptide with
desired functional or structural properties are also included and encompassed by the
expression "functional equivalents".

A cloning vehicle or a vector is a plasmid or phage DNA or other DNA sequence (such as a
linear DNA) which provides an appropriate nucleic acid environment for the transfer of a
gene of interest into a host cell. The cloning vehicles of the invention may be designed to
replicate autonomously in prokaryotic and eukaryotic hosts. In Trichoderma, the cloning
vehicles generally do not autonomously replicate and instead merely provide a vehicle for
the transport of the gene of interest into the Trichoderma host for subsequent insertion into
the Trichoderma genome. The cloning vehicle may be further characterized by one or a small
number of endonuclease recognition sites at which such DNA sequences may be cut in a
determinable fashion without loss of an essential biological function of the vehicle, and into
which DNA may be spliced in order to bring about replication and cloning of such DNA.
The cloning vehicle may further contain a marker suitable for use in the identification of
cells transformed with the cloning vehicle. Markers are, for example, tetracycline resistance
or ampicillin resistance for E. coli and, for example, phleomycin resistance or acetamidase for
Trichoderma. The word "vector" is sometimes used for "cloning vehicle". Alternatively,
such markers may be provided on a cloning vehicle which is separate from that supplying the
gene of interest.

A vehicle or vector similar to a cloning vehicle but which is capable of expressing a gene of
interest which has been cloned into it, after transformation into a desired host, is called an
expression vector. In a preferred embodiment, such expression vehicle provides for an
enhanced expression of a gene of interest which has been cloned into it, after transformation
into a desired host.

The gene of interest which is provided to a fungal host as part of a cloning or expression
vehicle integrates into the fungal chromosome. Sequences which derive from the cloning
vehicle or expression vehicle may also be integrated with the gene of interest during the
integration process.

The gene of interest may preferably be placed under the control of (i.e., operably linked to)
certain control sequences such as promoter sequences provided by the vector (which integrate
with the gene of interest). If desired, such control sequences may be provided by the fungal
host's chromosome as a result of the locus of insertion.

A nucleic acid molecule, such as DNA, is said to be "capable of expressing" a polypeptide if
it contains expression control sequences which contain transcriptional regulatory information
and such sequences are "operably linked" to the nucleotide sequence which encodes the
polypeptide.
Bacterial cellulase genes have widely been isolated by transforming genomic libraries into E.
coli and screening activities on cellulose-containing plates (reviewed by Béguin et al. 1987).
This approach relies on the functionality of promoter sequences from other prokaryotes in E.
coli and is not applicable to eukaryotes. Furthermore, eukaryotic genes, such as the T. reesei
EGV described here, contain introns which cannot be excised in E. coli and thus disturb the
reading frame. Moreover, the Trichoderma cellulases cannot generally be expressed in E.
coli in active form even if expressed from cDNA coupled to bacterial expression signals.
Traditionally fungal cellulase genes have been cloned using either differential hybridization,
antibodies raised against the corresponding enzymes or hybridization with oligonucleotide
probes based on the protein sequence of the enzymes (Béguin et al. 1987). All these methods
are laborious and demand a lot of time and previous knowledge of the corresponding
enzymes.



With the method according to the present invention, genes coding for new activities can be
easily isolated without any previous knowledge of the protein. According to the invention, a
fungal strain (e.g. Trichoderma) is cultivated on a culture medium which will induce enzyme
production. Such culture medium typically contains cellulosic substrate, if endoglucanase
production is aimed at. After cultivation, the mRNA of the strain is isolated and the
corresponding cDNA is formed. cDNA made from the organism of interest is cloned into a
yeast vector to construct an expression gene library in yeast, for instance Saccharomyces
cerevisiae. The genes of the fungus are then expressed under any suitable promoter providing
sufficient expression level, such as the yeast promoter PGK. The enzyme, e.g.
endoglucanase, is extracellularly secreted and the colonies producing the desired enzyme,
e.g. the endoglucanase, can be identified on the basis of their production of enzyme activity.

Screening can be effected with activity plate assays. Thus, according to one preferred
embodiment of the present invention, the endoglucanase gene is isolated by plating the
expression library onto plates containing barley β-glucan as substrate. After growth the cells
are washed away and the plates are stained with Congo red to reveal the hydrolysis halos. Up
to 50 % of the clones giving halos may contain endoglucanase. The genes coding for
different endoglucanases can be identified by analyzing the clones.

WO 94/28117

The expression gene library can also be constructed by using some other yeast promoter
which will provide a weaker level of expression. If it is to be expected that the enzyme is
deleterious to the yeast, the inducible GAL1 promoter would be recommendable. It is also
possible to use the endoglucanase's own promoter and, for the purpose of isolating the genes,
a chromosomal gene library can, in some cases, be used. The gene library can also be
constructed in a single copy plasmid. Also any other yeast strain with established
transformation procedures can be used as a host, because their secretion capabilities are
usually even higher than that of Saccharomyces.

In summary, the invention comprises the steps of

- enriching the mRNA pool of a fungal strain, e.g. Trichoderma, producing
  endoglucanase activity in respect of the mRNA of the endoglucanase by cultivating the
  strain in conditions which will induce the endoglucanase production of said strain,
- isolating mRNA from the strain,
- preparing cDNA corresponding to the isolated mRNA,
- placing the cDNA thus obtained in a vector under the control of a suitable promoter,
- transforming the recombinant plasmids into a yeast strain which naturally does not
  produce significant amounts of the endoglucanase in order to provide an expression
  library,
- cultivating the yeast clones thus obtained on a cultivation medium in order to express
  the expression library in the yeast,
- separating the yeast clones producing endoglucanase from the other yeast clones,
- isolating the plasmid DNA of said separated yeast clones, and,
- if desired, sequencing the DNA in order to determine the DNA sequence coding for
  the endoglucanase.
The gene egl5 isolated according to the above method was sequenced according to
conventional methods. The DNA sequence of egl5 is shown in Figure 1 and also indicated in
SEQ ID NO. 1.

The gene codes for a previously unknown protein of 242 amino acids, the
sequence of which is depicted in SEQ ID NO. 2. Interestingly, this protein contains the two
conservative domains found in all Trichoderma cellulases, namely the cellulose-binding
domain (CBD) and the linker region that connects the CBD to the catalytic core domain. The
approximate regions comprising these domains are indicated in Figure 1, the linker region
being the part of the sequence marked with the letter B, whereas the cellulose binding
domain is marked with the letter A. The putative N-glycosylation site is marked with an
asterisk. At the beginning of the protein a 17 amino acid long signal sequence (Met-Lys-Ala-
Thr-Leu-Val-Leu-Gly-Ser-Leu-Ile-Val-Gly-Ala-Val-Ser-Ala), which is underlined in Figure 1,
can be predicted. If the signal sequence cleavage occurs at this position, the mature protein
consists of 225 amino acids and has a calculated molecular weight of 22.799 kDa.
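
The figures in this paragraph are related by simple arithmetic, which the following Python sketch makes explicit. It is an illustration only: the function names `mature_length` and `mean_residue_mass` are hypothetical, and the sketch assumes that cleavage removes exactly the predicted 17-residue signal sequence.

```python
FULL_LENGTH = 242            # residues encoded by the gene (from the text)
SIGNAL_PEPTIDE_LENGTH = 17   # predicted signal sequence (from the text)
MATURE_MASS_DA = 22_799      # calculated mass of the mature protein, 22.799 kDa


def mature_length(full_length, signal_length):
    """Residues left after removal of the N-terminal signal sequence."""
    if not 0 <= signal_length < full_length:
        raise ValueError("signal peptide must be shorter than the protein")
    return full_length - signal_length


def mean_residue_mass(mass_da, n_residues):
    """Average mass per residue in daltons."""
    return mass_da / n_residues


if __name__ == "__main__":
    n = mature_length(FULL_LENGTH, SIGNAL_PEPTIDE_LENGTH)
    print(f"mature protein: {n} residues")
    print(f"mean residue mass: {mean_residue_mass(MATURE_MASS_DA, n):.1f} Da")
```

The resulting mean of roughly 101 Da per residue is somewhat below the figure of about 110 Da often used as a rule of thumb for proteins, which would be consistent with a sequence rich in small residues.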

The core of the endoglucanase is separately depicted in SEQ ID NO. 3. It would appear that
the core of the novel endoglucanase is primarily responsible for the cellulolytic activity of the
novel enzyme. Thus, it is conceived that an endoglucanase enzyme product may in principle
comprise the polypeptide of the core domain only. However, the surprising enzymatic
properties described below are probably attributable to a combination of the above three
regions and domains, and they will therefore best be obtained if the protein comprises all
three parts.

It is believed that the predicted 17 aa signal peptide indicated in Figure 1 can be substituted
by another suitable signal peptide, possibly of a different length. Such a signal peptide
should typically comprise a positively charged amino acid at the beginning followed by a
stretch of hydrophobic amino acids. Depending on the signal sequence cleavage site in vivo
and the possible proteolytic processing occurring frequently in cellulases, the molecular
weight of the active polypeptide may vary somewhat and the novel endoglucanase is
therefore referred to as having a molecular weight in unglycosylated form of about 20 to 25
kDa.

In its O- and N-glycosylated form the enzyme can be significantly bigger, having apparent
molecular weights of 35 kDa or even much higher when produced in the yeast
Saccharomyces. Furthermore, cellulases frequently undergo drastic proteolytic cleavage
which removes the CBD (and linker) regions and consequently the size of EGV in fungal
culture medium can be even about 15 kDa in unglycosylated form.

While modelling protein conformations from first principles is not possible, the high
sequence similarity between fungal CBDs warrants the construction of a homology model
(Sali et al., 1990). The feasibility of modelling side chain conformations has been
demonstrated in similar cases (Blundell et al., 1988, Heiner et al., 1993). Modelling of the
EGV CBD revealed some interesting differences compared to the known structure of the
CBHI CBD. This wedge-shaped domain seems less sharp in EGV and there are some
differences in main chain and side chain conformations and in hydrophobic properties in
areas known to be important for binding of the CBHI CBD onto the cellulose surface or for
the full activity of the CBHI enzyme against crystalline cellulose. Preliminary binding data
indicate that the EGV CBD is able to bind to cellulose.


The protein belongs to a new family K of cellulases together with the endoglucanase B of
Pseudomonas fluorescens and the endoglucanase V of Humicola insolens as studied by
hydrophobic cluster analysis by Henrissat and Bairoch (1993). This strongly suggests that EGV
is structurally different from all Trichoderma cellulases characterized so far. Based on this, it
would also appear that there are catalytic differences between the present enzyme and the
other cellulases. The fact that EGV is a true endoglucanase was confirmed by ¹H-NMR
spectroscopy, which showed that the internal β-1,4-linkages were hydrolysed by EGV when
barley β-glucan (a soluble glucose polymer containing β-1,4- and β-1,3-linkages) was used as
substrate.


Thus, for instance, as evidenced by Example 11 below, the novel endoglucanase appears to
work synergistically with the known endoglucanase EGII on hydroxyethyl cellulose.

Expression of the gene egl5

Once the vector or DNA sequence containing the construct(s) is prepared for expression, the
DNA construct(s) is introduced into an appropriate host cell by any of a variety of suitable
means, including transformation as described above. After the introduction of the vector,
recipient cells are grown in a selective medium, which selects for the growth of transformed
cells. Expression of the cloned gene sequence(s) results in the production of the desired
protein, or in the production of a fragment of this protein. This expression can take place in
a continuous manner in the transformed cells, or in a controlled manner.

Expression of the gene can be obtained in any fungus with developed transformation and
expression methods.

Trichoderma is an especially useful and practical host for the synthesis of the enzyme
preparations of the invention because Trichoderma is capable of secreting protein in large
amounts, for example, concentrations as much as 40 g/l culture fluid have been reported;
the homologous Trichoderma cbh1 promoter provides a very convenient promoter for
expression of genes-of-interest because it is a strong, single copy promoter which normally
directs the synthesis of up to 60 % of the secreted protein from the Trichoderma host; the
transformation system is highly versatile and can be adapted for any gene of interest; the
Trichoderma host provides an "animal cell type" high mannose glycosylation pattern; and
culture of Trichoderma is supported by previous extensive experience in industrial scale
fermentation techniques. In addition, several promoters active on glucose medium can be
used, which enable the production of the enzyme essentially free from other cellulases.


Expression of the protein in the Trichoderma hosts requires the use of regulatory regions
functional in such hosts. A wide variety of transcriptional and translational regulatory
sequences can be employed, since Trichoderma generally recognizes eukaryotic host
transcriptional controls, such as, for example, those of other filamentous fungi. Such control
regions may or may not provide an initiator methionine (AUG) codon, depending on whether
the cloned sequence contains such a methionine. Such regions will, in general, include a
promoter region sufficient to direct the initiation of RNA synthesis in the host cell.

According to the invention the DNA sequence encoding EGV can be transformed into
Trichoderma and expressed, for example under the strong cbh1 promoter, as described in
EP-A 244,234 and US 5,298,405, or other promoter functional in Trichoderma. The DNA
sequence coding for EGV can be integrated into the general expression vector pAHM110.
The transformation can be done as a cotransformation using two circular plasmids, the
selection marker being located in one of the plasmids and the DNA sequence encoding EGV
in the other, or the selection marker and the DNA sequence encoding EGV can be located
in the same plasmid, or linear fragments can be used in the transformation. Possible selection
markers are, for instance, trpC or argB from Aspergillus nidulans or argB or pyr4 from T.
reesei or amdS from A. nidulans or trp1 from Neurospora crassa or phleomycin or
hygromycin resistance markers of bacterial origin (EP-A 244,234, US 5,298,405, and EP-B
539,395 and Ulhoa et al., 1992, Transformation of Trichoderma species with dominant
selectable markers, Curr. Genet. 21:23-26) or other selection marker shown to function in
Trichoderma in future (Karhunen et al., 1993, High frequency one-step gene replacement in
Trichoderma reesei I. Endoglucanase I overproduction, MGG 241: 515-522, and Suominen
et al., 1993, High frequency one-step gene replacement in Trichoderma reesei II. Effects of
deletions of individual cellulase genes, MGG 241: 523-530).

To construct a Trichoderma strain producing endoglucanase V as the main cellulolytic
enzyme it is possible to construct Trichoderma strains that do not produce one or more of the
other cellulolytic enzymes: endoglucanase I and II and cellobiohydrolase I and
II. The desired cellulolytic genes can be made deficient (EP-A 244,234, US 5,298,405,
Karhunen et al. 1993 and Suominen et al. 1993). If genes are expressed under the cbh1
promoter the expression is repressed by glucose and thus the strains must be grown on
cellulose-containing medium.

Alternatively, it is possible to construct Trichoderma strains expressing EGV under glucose
promoter. This means that the Trichoderma strains expressing EGV can be grown on glucose
containing medium. Possible glucose promoters are, for example, the glucose derepressed cbh1
promoter of the plasmid pML016del5(11) (et al., 1992) and the promoter of the cDNA1
gene (Nakari et al. 1992) or other glucose promoters.

According to the invention, there is also provided a method for producing in fungal and yeast
hosts, such as the yeast Saccharomyces and filamentous fungi, such as Trichoderma, an
enzyme preparation having an endoglucanase activity stemming from an endoglucanase
enzyme, the molecular weight of which (in unglycosylated form) is 20 to 25 kDa.

Further, if desired activities are present in more than one recombinant host, such
preparations can be isolated from the appropriate hosts and combined prior to use in the
method of the invention.
10 Pnryp^ preparations 

To obtain the enzyme preparations of the invention, containing elevated levels of the EGV,
the recombinant hosts described above having the desired properties (that is, hosts capable of
expressing the novel endoglucanase enzyme) are cultivated under suitable conditions (cf.
above), the desired enzymes are secreted from the host into the culture medium, and the
enzyme preparation is recovered from said culture medium.

As mentioned above, the enzyme preparation can be produced by cultivating the fungal strain
in conditions where the regulatory regions directing endoglucanase expression are operating,
such as on a glucose-containing medium if the yeast or Tric[...]
[...] is used. Thus, if endoglucanase V is expressed under a glucose promoter, the Trichoderma
strains can be grown on, e.g., glucose minimal medium (Penttilä [...]
glucose-containing medium, for example Bacto-Peptone 5 g/l, yeast extract 1 g/l, KH2PO4 [...]
g/l, (NH4)2SO4 4 g/l, MgSO4 0.5 g/l, CaCl2 0.5 g/l and trace elements FeSO4·7H2O 5 mg/l,
MnSO4·H2O 1.6 mg/l, ZnSO4·7H2O 1.4 mg/l and CoCl2·6H2O 3.7 mg/l, pH 5.0 - 6.0.

The enzyme can also be produced in other conditions, such as on Solka floc [...]
Trichoderma cbh1 promoter is used, or on a galactose-containing medium [...]
galactose-inducible promoter is used. The cellulose-containing cultivation medium may, for
instance, comprise 6 % Solka floc cellulose (BW40, James River Corporation, Hackensack,
NJ), 3 % distiller's spent grain, 0.5 % KH2PO4, 0.5 % (NH4)2SO4 and 0.1 % struktol as an
antifoaming agent (struktol SB 2023, Schill & Seilacher, Hamburg, FRG). Trichoderma
strains are sensitive to glucose repression and require an inducer (cellulose, lactose or
sophorose). The pH should preferably be kept at approximately pH 5 to 6 by the addition of
phosphoric acid or ammonia and the temperature at 30 °C during the cultivation.
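
As a rough illustration of the cellulose-containing medium described above, the following sketch converts the stated percentages into weighed amounts for a given batch volume. It assumes the percentages are weight per volume (g per 100 ml), which the text does not state explicitly; the function name and the 10 litre batch size are hypothetical.

```python
# Sketch only: percentages are read as % w/v (g per 100 ml), an assumption.
CELLULOSE_MEDIUM_PERCENT = {
    "Solka floc cellulose": 6.0,
    "distiller's spent grain": 3.0,
    "KH2PO4": 0.5,
    "(NH4)2SO4": 0.5,
    "struktol antifoam": 0.1,
}


def grams_for_volume(percent_wv, volume_l):
    """Return grams of each component for volume_l litres of medium.

    1 % w/v corresponds to 10 g per litre.
    """
    return {name: pct * 10.0 * volume_l for name, pct in percent_wv.items()}


if __name__ == "__main__":
    # Hypothetical 10 litre batch.
    for name, grams in grams_for_volume(CELLULOSE_MEDIUM_PERCENT, 10.0).items():
        print(f"{name}: {grams:.0f} g")
```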

The enzyme preparation is recovered from the culture medium by using methods well known
in the art. However, the enzyme preparations of the invention may be utilized directly from 
the culture medium with no further purification. If desired, such preparations may be lyophi- 
lized, immobilized or the enzymatic activity otherwise concentrated and/or stabilized for 
storage. 


If desired, the expressed endoglucanase protein may be further purified in accordance with 
conventional conditions, such as extraction, precipitation, chromatography, affinity chromato- 
graphy, electrophoresis, or the like. 

Applications of the novel enzyme

The catalytic core of the novel enzyme is the smallest of fungal or bacterial cellulases
characterized. Therefore the enzyme and the enzyme preparations according to the invention
have application in the treatment of pulp and paper and in the textile industry. Furthermore,
the enzyme can be used in the fodder industry. The properties of the novel endoglucanase are
unexpected for an endoglucanase on the basis of general knowledge.

Being a β-glucanase, the novel enzyme can be used for hydrolyzation of the β-glucan of
barley. As a result, the viscosity of the fodder is lowered and the nutritional value of the
fodder is improved.

As evidenced in Example 8, the pH optimum of the enzyme is higher than those of the other
endoglucanases produced by strains of the species Trichoderma. This favorable pH range can
be utilized in many ways. One preferred application is for removing colour from denim
jeans: in acidic pH, reabsorption of the colour occurs, but at neutral pH there is much less
reabsorption. Another preferred embodiment comprises deinking. Normally, the pH of a
slurry of water and newsprint is about 5.5 to 6.0 and therefore the novel enzyme can be used
without any need for adjustment of the pH. On the other hand, coated paper contains fillers
and pigments which will raise the pH of an aqueous paper slurry formed therefrom. If the
pH of the slurry is lowered by adding mineral acid, at least some of the suspended or
dissolved fillers and pigments may precipitate, e.g. in the form of calcium sulphate.


The small size and the advantageous pH range of the novel enzyme make it possible to use it
for treating recycled fibre in order to improve the technical properties thereof. The enzyme is
also applicable for improving pulp drainage.

The invention is described in more detail with the aid of the following non-limiting
examples.

In the examples, the following strains and vectors were employed: E. coli strains PLK-F',
pBluescript SK-, and XL-1-Blue (Stratagene) were used as hosts for plasmids and PLK-F' as
a host for the cDNA library. The following plasmids were used: pAS11, pAS13, pALK487
and pALK183. The T. reesei strain QM9414 was used as a source of RNA for cDNA
preparation and Northern analysis. T. reesei ALKO2221 and ALKO3524 were used as hosts
for EGV expression. S. cerevisiae strain DBY746 (α [...] leu2-3 leu2-112 [...]-289
[...]) was used as a host for the expression library. Strain MD40-4c (α [...] leu2-3
leu2-112 his3-11 his3-15) was used as a host for the plasmids pMP311, pMS3, pMP11 and
pMP29 carrying the egl1, egl2, cbh1 and cbh2 genes of T. reesei, respectively (Penttilä et al.
1987, 1988). The yeast expression vector pFL60 (Minet and Lacroute 1990) containing the
constitutive yeast PGK promoter and terminator, the URA3 marker gene and the 2 micron
plasmid replication origin was kindly provided by Dr. M. Minet, Centre de Génétique
Moléculaire, C.N.R.S., France.

Example 1

Isolation of endoglucanase gene by expression in yeast and hydrolytic properties of the
yeast

T. reesei strain QM9414 was cultivated in a 10 liter fermentor at 28 °C and pH 4.0 for 42
hours. The cultivation medium used to induce hydrolytic enzyme production contained 2 %
Solka floc cellulose, 1 % distiller's spent grain, 0.2 % locust bean gum galactomannan
(Serva), 0.5 % KH2PO4 and 0.5 % (NH4)2SO4. After 42 hours of growth, lactose (Sigma),
Birke 150 acetylglucuronoxylan and oat spelt arabinoxylan were added in an amount of 0.1
% each and the cultivation was continued for a further 24 hours.

Total RNA from the T. reesei strain was isolated as described by Chirgwin et al. (1979), and
the poly(A)+ fraction was separated by chromatography through oligo(dT)-cellulose (BRL).
cDNA, synthesized by the ZAP-cDNA synthesis kit (Stratagene), was ligated to the EcoRI-
XhoI cut plasmid pAJ401. Plasmid pAJ401 was derived from plasmid pFL60 (Minet and
Lacroute 1990) by changing the two cloning sites EcoRI and XhoI between the yeast PGK
promoter and terminator into the reverse orientation. Transformation of E. coli strain PLK-F'
by electroporation (Bio-Rad) according to the manufacturer's instructions yielded a library of
3.5 x 10^[...] independent clones. Plasmids were isolated from the pool of E. coli transformants
and transformed into S. cerevisiae strain DBY746 by electroporation (Bio-Rad) according to
the manufacturer's instructions. Electroporation with 7 µg of plasmid DNA yielded a library
of 8 x 10^[...] yeast transformants.

1.2 x 10^[...] yeast cells were plated on barley β-glucan-containing plates to a density of 2000
colonies / 85 mm plate and grown at 30 °C for 3 days. Colonies were replicated and the
original plates stained with Congo Red. Unstained areas around yeast colonies indicate
hydrolysis of the substrate to oligosaccharides. Colonies showing activity were picked up
from the replica plates and purified on new activity plates. Plasmids were recovered from the
purified clones and analysed by restriction enzyme digestions. 20 clones gave a similar
pattern of bands which was clearly different from the earlier isolated cellulase genes of T.
reesei.

Transformation of the plasmids back to yeast confirmed that the activities were caused by the
cDNA inserts. One of these plasmids, pAS4 (cf. Figure 3), was studied further. The insert in
the pAS4 plasmid was named egl5 and the corresponding protein EGV.

egl5 cDNA was sequenced from both strands of the original pAS4 plasmid using the Sanger
dideoxynucleotide method, T7 DNA polymerase (Pharmacia) and sequence specific primers.

The sequence obtained is shown in SEQ ID NO. 1.

The chromosomal egl5 gene was isolated from a T. reesei cosmid library (Mäntylä, A. et al.,
Curr. Genet. 1992, 21: 471-477) by using the egl5 cDNA as a probe. An about 6 kb HindIII
fragment was subcloned to pBluescript SK-, resulting in plasmid pAS13 (Fig. 7a). The
introns and coding sequence of the egl5 gene are shown in Figure 10 (SEQ ID NO. 11).

The activities of the yeast strain DBY746 carrying the pAS4 plasmid were studied by plate
assays and they were compared with the activities of the yeast strains producing CBHI,
CBHII, EGI and EGII.

Hydrolytic activities produced by recombinant yeast cells were detected on SD plates
containing 0.1 % barley β-glucan (β-D-1,3-1,4-glucan, viscosity 20-30 c.s.; Biocon, UK;
Sherman 1991) or hydroxyethyl cellulose (HEC, Fluka, Switzerland, product 54290). After
growth the plates were stained with Congo Red (Merck) as described by Penttilä et al.
(1987) to reveal the hydrolysis halos. Xylanase activity plates containing 0.2 % of a Remazol
Brilliant Blue-dyed derivative of xylan (RBB-xylan, Sigma) needed no further treatment.
Activities against synthetic substrates, 4-methylumbelliferyl β-D-cellobioside (MUC; Koch-
Light, UK) or 4-methylumbelliferyl β-D-lactoside (MUL; Lambda Probes & Diagnostics,
Austria), were detected as described by Penttilä et al. (1987).

The EGV protein showed a clear activity against β-glucan but the activity was lower than the
activities of the strains producing EGI, CBHII or EGII (Table 1). However, the expression
levels and the secretion efficiencies of foreign proteins in yeast may vary and thus it is not
possible to draw any definite conclusions concerning the level of enzyme activity. Also, the
pH on the plates is not optimal for EGV function. EGV shows some activity against
hydroxyethyl cellulose (HEC) in plate assays. No activity was detected on plate assays
towards RBB-xylan or the small synthetic substrates, methylumbelliferyl cellobioside (MUC)
or methylumbelliferyl lactoside (MUL).

Table 1. Hydrolytic activities of the yeast strains carrying the cellulase genes of
Trichoderma reesei. The extent of hydrolysis of the substrate was estimated
visually and is indicated by +.

Example 2

Construction of endoglucanase expression vectors with truncated fragments of the cbh1
promoter
The vector pML016 (Figure 5) contains a 2.3 kb cbh1 promoter fragment (SEQ ID 4)
starting at the 5' end from the EcoRI site, isolated from a chromosomal gene bank of Trichoderma
reesei (Teeri et al., 1983), a 3.1 kb BamHI fragment of the lacZ gene from plasmid pAN924-
21 (van Gorcom et al., 1985) and a 1.6 kb cbh1 terminator (SEQ ID 5) starting from 84 bp
upstream from the translation stop codon and extending to a BamHI site at the 3' end (Shoe-
maker et al. 1983; Teeri et al., 1983). These pieces were linked to a 2.3 kb long EcoRI-
PvuII region of pBR322 (Sutcliffe, J.G., 1979) generating junctions as shown in Figure 5.
The exact in-frame joint between the 2.3 kb cbh1 promoter and the 3.1 kb lacZ gene was
constructed by using an oligo depicted in Figure 5. A polylinker shown in Figure 5 was
cloned into the single internal fflii site in the cbh1 promoter for the purpose of promoter
deletions. A short Saa linker shown in Figure 5 was cloned into the joint between the
pBR322 and cbh1 promoter fragments so that the expression cassette can be released from
the vector by restriction digestion with Saa and [...]. Progressive unidirectional deletions
were introduced to the cbh1 promoter by cutting the vector with [...] and »k»I and using the
Erase-A-Base System (Promega, Madison, USA) according to the manufacturer's instructions.
Plasmids obtained from different deletion time points were transformed into the E. coli strain
DH5α (BRL) by the method described in (Hanahan D. 1983) and the deletion end points
were sequenced by using standard methods.

Example 3

Construction of vectors for expression of EGV in Trichoderma in glucose-containing
medium

In order to produce EGV protein in the Trichoderma reesei QM9414 strain essen[...]
other cellulases in a medium containing glucose, the plasmid pAS16 (Fig. 4) was constructed.
There, the egl5 cDNA was cloned under the truncated, glucose-derepressed cbh1 promoter
of the plasmid pML016del5(11), generated as explained in Example 2. The plasmid
contained a 1110 bp deletion in the cbh1 promoter beginning from the promoter internal
polylinker and ending 385 bp before the translation initiation site (Fig. 5). The sequence of
this truncated promoter is provided as SEQ ID NO. 6. Plasmid pML016del5(11) was digested
with the restriction enzymes [...] and Snud. The vector part containing the glucose-
derepressed cbh1 promoter, the cbh1 terminator and the pBR322 sequence was blunt-ended
with Mung bean nuclease, dephosphorylated with calf intestine alkaline phosphatase and
ligated to the egl5 cDNA fragment.






The yeast expression plasmid pAS4 was digested with EcoRI and partially with XhoI to
isolate the full-length egl5 cDNA. The ends of the cDNA were filled in with the Klenow
polymerase enzyme and the fragment was ligated into the SmaI-cleaved vector pSP73 (Pro-
mega). The resulting plasmid pAS11 was digested with EcoRI and XbaI, filled in with the
Klenow polymerase and ligated to the vector part of the expression vector pML016del5(11).

Twenty micrograms of the pAS16 plasmid were digested with EcoRI and SphI, phenol-
extracted, precipitated and transformed into Trichoderma reesei QM9414 together with three
micrograms of the plasmid p3SR2 (Hynes et al., 1983) containing the acetamidase gene
according to Penttilä et al. (1987).


The promoter of the cDNA1 gene (Nakari et al., 1992) was also used to direct the synthesis
of the EGV protein on glucose-containing medium.

The promoter of the cDNA1 gene was cloned from the chromosomal DNA by PCR using the
5' primer GC3T CTG MO OAC GTO QAA TOA TOO (SEQ ID NO. 7) and the 3' primer oat oca
TCG ATC fiT C CQC GO G TTC AGA GAA GTT OTT OGA TTO ATC AAA AAG (SEQ ID NO. 8). The
underlined ATCGAT in the 3' primer is a ClaI site and the CCGCGG a KspI site.

The egl5 cDNA and the cbh1 terminator were cloned as one fragment from the plasmid
pAS16 by PCR using the 5' primer oao AOAjaaLCfifl-TOA tct tcc atc tco tot ctt OCT tot
AAC (SEQ ID NO. 9) and the 3' primer atc gto oat oca tta tta aca ctt coo too (SEQ ID
NO. 10). The underlined CCGCGG in the 5' primer is a KspI site.
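
The recognition sites named above (ATCGAT and CCGCGG) are both palindromic, i.e. each equals its own reverse complement, which is why the same site can be engineered into either primer. The sketch below checks this property and locates the sites in a sequence. The demonstration sequence is hypothetical and is not one of the primers of SEQ ID NO. 7 to 10.

```python
# Sketch only: checks that the two recognition sites named in the text are
# palindromic and finds them in a made-up demonstration sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SITES = {"ClaI": "ATCGAT", "KspI": "CCGCGG"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic if it equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def find_sites(seq, site):
    """Return the 0-based start positions of site within seq."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


if __name__ == "__main__":
    demo = "GGATCGATTTCCGCGGAA"  # hypothetical sequence, for illustration only
    for name, site in SITES.items():
        print(name, site, is_palindromic(site), find_sites(demo, site))
```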

Eight micrograms of both of the fragments were digested with the KspI enzyme, purified from
agarose gel and ligated. The ligation mixture was extracted with phenol, precipitated [...]
instead of a plasmid in the Trichoderma transformation together with three micrograms of the
p3SR2 plasmid.

The Amd+ transformants from the pAS16 transformation were streaked twice onto plates
containing acetamide (Penttilä et al., 1987), and then cultivated on Potato Dextrose Agar
plates (Difco) from which spore suspensions were made. EGV production was tested from 50
ml shake flask cultures carried out in minimal medium according to Penttilä et al. (1987)
except that the amount of glucose was 4 %, KH2PO4 3 %, KjK>, 0.8 %, (NH4)2SO4 0.2 %
and the medium was supplemented with 0.2 % peptone. Glucose was added as a 15 % solution
when necessary to keep the level above 1 % during the whole four days of the cultivation.
The culture supernatants of 55 transformants were analyzed for activity against barley β-
glucan by the DNS-method (Zurbriggen et al., 1990).

The spore suspensions of the three best EGV-producing clones (numbers 101, 79 and 19)
were purified to single spore cultures on Potato Dextrose Agar plates. EGV production was
analyzed again from these purified clones as described above. The best producing transfor-
mant 101c was analysed by Southern blotting using conventional methods and the presence of
the expression cassette in the genomic DNA was confirmed. Northern analysis showed that the
egl5 gene was expressed from the constructs on glucose medium.



Example 4

Construction of EGV expression plasmid pALK956

The expression plasmid pALK956 (Fig. 7d) contains:

1) T. reesei egl5 gene fused to the cbh1 promoter. A fragment containing the cbh1 terminator
was included after egl5 to ensure stop in the transcription.

2) E. coli hph (hygromycin B phosphotransferase; Gritz and Davies, 1983) as a marker gene
for transformation. The gene was expressed from the T. reesei pki (pyruvate kinase; Schin-
dler et al., 1993) promoter.

3) Elongated cbh1 terminator as a flanking region to ensure stop in pki transcription and to
target the expression cassette, together with the cbh1 promoter fragment, to the cbh1 locus.

The construction of pALK956 is shown in detail in Figs. 7a - 7d. For the construction, the
plasmids pAS11, pAS13, pALK487 and pALK183 were used. The plasmid pAS11 contains
the egl5 cDNA (Fig. 1) and pAS13 contains the chromosomal egl5 gene (Fig. 10). The
plasmid pALK487 contains the T. reesei cbh1 promoter (the 2.2 kb Stol - &icU fragment
originally from the plasmid pAMH110; Nevalainen et al., 1991) and cbh1 terminator (the 0.7
kb AvaII fragment starting 113 bp before the stop codon of the cbh1 gene; for the cbh1
sequence, see Shoemaker et al., 1983). The plasmid pALK183 contains the hph gene under the
control of the pki promoter. It was constructed from pRLMex30 (Mach et al., 1994) by
changing the cbh2 terminator to the 1.6 kb cbh1 elongated terminator (AvaII - BamHI fragment).

The exact fusion of the egl5 gene to the cbh1 promoter was done by PCR. The 5'-primer
contained the last 26 nucleotides of the cbh1 promoter including the SacII site and the first 18
nucleotides of the coding sequence of egl5 (5'-AATAGTCAACCGCGGACTGCGCATCAT-
GAAGGCAACTCTGGTT; the SacII site is underlined, egl5 sequence is bolded). The 3'-
primer contained 21 nucleotides of egl5 sequence, including the BamHI site about 0.7 kb from
the beginning of egl5 sequence (5'-GGGCGTGGGATCCGTCTCTTG; the BamHI site is
underlined). The plasmid pAS13 was used as a template in the PCR reaction.

The 0.7 kb PCR fragment (filled in with DNA polymerase I Klenow fragment and cut with
BamHI), containing the exact link between the cbh1 promoter and the egl5 gene, was ligated
to PvuII - BamHI digested pAS11 to obtain pALK951. The fusion and the PCR fragment
were sequenced to ensure that no mistakes had occurred in the PCR amplification. Plasmid
pALK955, containing the fusion of the egl5 to the cbh1 promoter, was obtained by ligating the
EcoRI/Klenow - SacII fragment from pALK951 between the cbh1 promoter and terminator in
the plasmid pALK487 (BamHI/Klenow - SacII). The hph marker gene (under the control of
the pki promoter) and the cbh1 3'-flanking region (elongated terminator) were ligated to SnA
cut pALK955 from pALK952 (JStol - MrmflU fragment / Klenow) to construct pALK956.

The plasmid pALK952 was constructed from pALK183 by shortening the pki promoter
compared to the promoter used in pALK183 and pRLMex30 (Afofl/partial - XhoU Klenow).






The 7.4 kb expression cassette from pALK956 can be removed with NotI digestion.



Thus, in summary, in the expression plasmid pALK956, the egl5 gene is fused to the cbh1
promoter. The E. coli hph (hygromycin B phosphotransferase) gene is used as a marker for
the transformations. The cbh1 3'-flanking region (elongated terminator) is included to ensure
stop in the pki transcription and to target the expression cassette, together with the promoter
fragment, to the cbh1 locus.

Example 5

Expression of EGV under the cbh1 promoter in cellulase-inducing medium

The EGV expression plasmid, pALK956, was digested with NotI, and the 7.4 kb fragment
was purified from agarose gel. 2-3 µg of the linear fragment was transformed into T. reesei
strains ALKO2221 and ALKO3524 according to Penttilä et al. (1987) with the modifications
described in Karhunen et al. (1993). ALKO2221 is a low protease mutant from T. reesei
VTT-D-79125 (Bailey and Nevalainen, 1981), prepared in our laboratory (A. Mäntylä).
ALKO3524 is a strain derived from VTT-D-79125, where the cbh2, egl2 and egl1 genes have
been deleted using the A. nidulans trpC (Yelton et al., 1984), A. nidulans amdS (Kelly and
Hynes, 1985) and Streptoalloteichus hindustanus phleoR (Mattern et al., 1987) marker genes,
respectively. The method of one-step gene replacement with a linear fragment and flanking
regions of the corresponding cellulase locus is described in Suominen et al. (1993).

HygB+ transformants were selected on plates containing T. reesei minimal medium (Penttilä
et al., 1987) with 100 µg hygromycin/ml. Transformants were purified by single spore
selection on selective medium and then cultivated on Potato Dextrose Agar. Purified
transformants were grown in shake flasks in a medium containing 4 % whey, 1.5 % complex
nitrogen source derived from grain, 1.5 % KH2PO4 and 0.5 % (NH4)2SO4. Cultures were
maintained at 30 °C and 250 rpm for 7 days. The culture supernatants were analyzed for
activity against barley β-glucan at pH 6.3 by the DNS-method (Zurbriggen et al. 1990).
Soluble protein was assayed by the method of Lowry et al. (1951) using bovine serum
albumin as standard. The detection of the 67 kDa CBHI protein was done in SDS-PAGE
followed by Coomassie Brilliant Blue staining. The results from the best EGV transformants
and the corresponding host strains are shown in Table 2. In the EGV-transforma[...]
β-glucanase activity measured at the optimum pH of EGV was enhanced about twofold.

Table 2. Expression of EGV under the cbh1 promoter in cellulase-inducing medium

Strain               Protein    β-glucanase          CBHI
                     (mg/ml)    activity (BU/ml)     (+/-)

ALKO2221             7.1        1945                 +
EGV/ALKO2221/11      5.2        3262                 +
EGV/ALKO2221/47      5.5        3236
EGV/ALKO2221/31      3.5        3757
EGV/ALKO2221/68      3.9        3479                 +
ALKO3524             9.3        3338                 +
EGV/ALKO3524/27      5.3        6770                 +
EGV/ALKO3524/28      7.6        7588                 +
EGV/ALKO3524/31      7.0        «22                  +
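
The "about twofold" enhancement can be checked against the legible entries of Table 2 by dividing each transformant's β-glucanase activity by that of its host strain. The sketch below does this; the dictionary layout and function name are illustrative, and the EGV/ALKO3524/31 entry is left out because its activity value is not legible in the source.

```python
# Sketch only: fold increase in beta-glucanase activity (BU/ml) of each
# transformant relative to its host strain, using the legible Table 2 values.
HOST_ACTIVITY = {"ALKO2221": 1945, "ALKO3524": 3338}

TRANSFORMANT_ACTIVITY = {
    "EGV/ALKO2221/11": ("ALKO2221", 3262),
    "EGV/ALKO2221/47": ("ALKO2221", 3236),
    "EGV/ALKO2221/31": ("ALKO2221", 3757),
    "EGV/ALKO2221/68": ("ALKO2221", 3479),
    "EGV/ALKO3524/27": ("ALKO3524", 6770),
    "EGV/ALKO3524/28": ("ALKO3524", 7588),
}


def fold_increases():
    """Return {transformant: activity / host activity}."""
    return {
        name: activity / HOST_ACTIVITY[host]
        for name, (host, activity) in TRANSFORMANT_ACTIVITY.items()
    }


if __name__ == "__main__":
    for name, fold in fold_increases().items():
        print(f"{name}: {fold:.2f}-fold")
```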



Example 6

Enzyme preparation containing EGV protein obtained from yeast where the egl5 gene is
expressed

Saccharomyces cerevisiae DBY746 containing the pAS4 plasmid was grown in a bioreactor
(Chemap LF 20, working volume 16 l) on a standard YPD medium. The inoculum (5 times
200 ml) was grown in shake flasks in selective synthetic complete medium without uracil.
Cultivation conditions were: temperature 30 °C, pH controlled between 5.2 and 5.9, aeration
about 15 l min⁻¹ and cultivation time 45 h. The yeast cells were separated from the medium
by centrifugation and the culture supernatant was concentrated 4-fold by ultrafiltration (PCI
ES 625 membranes).






The enzymatic activity in the concentrate was assayed by standard methods using appropriate
incubation times for the enzyme reaction against β-glucan (Zurbriggen et al., 1990a) and
hydroxyethyl cellulose, HEC (IUPAC, 1987). The β-glucanase activity was 0.7 nkat ml⁻¹ and
endoglucanase (HEC) activity less than 0.4 nkat ml⁻¹. No endoglucanase activity could be
detected in culture filtrates of control cultivations of yeast missing the EGV gene (S.
cerevisiae DBY746 carrying the plasmid pAJ401).
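
Activities in these examples are reported in nanokatals (nkat). For readers more used to international enzyme units, the following sketch converts between the two, using the standard relation 1 U = 1 µmol min⁻¹ ≈ 16.67 nkat. The function names are illustrative only.

```python
# Sketch only: conversion between nanokatals and international units (U).
# 1 kat = 1 mol/s and 1 U = 1 micromol/min, so 1 U = 1000/60 nkat.
NKAT_PER_UNIT = 1000.0 / 60.0


def nkat_to_units(nkat):
    """Convert an activity in nkat to international units (U)."""
    return nkat / NKAT_PER_UNIT


def units_to_nkat(units):
    """Convert an activity in international units (U) to nkat."""
    return units * NKAT_PER_UNIT


if __name__ == "__main__":
    # beta-glucanase activity of the yeast concentrate, from the text
    print(f"0.7 nkat/ml = {nkat_to_units(0.7):.3f} U/ml")
```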


The purified EGV preparation can be obtained from the ultrafiltration concentrate by standard
protein chromatography methods. The EGV protein can be bound to an anion exchange resin
(e.g. Mono Q columns or DEAE Sepharose FF, Pharmacia) in low ionic strength buffer and
at appropriate pH. The protein can be eluted out of the column using an increasing gradient of
NaCl (e.g. from 0 to 0.5 M in the buffer of binding). Alternatively, the impurities from the
preparation of EGV can be removed by binding them in anion exchange resin at appropriate
pH and ionic strength where EGV is not bound to the resin. Cation exchanger resins (e.g.
Mono S columns or CM Sepharose FF, Pharmacia) can be used in an analogous way by selecting
buffers of appropriately low pH (e.g. pH 4 - pH 6). EGV can also be purified by gel
permeation chromatography where it can be separated due to its small molecular size. The
columns of various materials (e.g. Sephacryl S-100 HR or various types of Sepharose and
Superose, Pharmacia; Fractogel TSK HW-55, Merck) in e.g. phosphate or acetate buffers
containing e.g. 0.05 - 0.5 M NaCl can be used. Hydrophobic interaction chromatography and
various affinity chromatography methods may also be used.


Example 7 

Enzyme preparation containing EGV protein obtained from Trichoderma reesei grown
on glucose

One of the best producing T. reesei QM9414 transformants (number QM/101c) was grown in
a bioreactor (Chemap LF 20, working volume 16 l) on a medium of Mandels and Weber
(1969) where Solka floc cellulose (10 g l⁻¹) was replaced by 20 g l⁻¹ of glucose and where
the concentrations of other nutrients were correspondingly doubled. The inoculum (5 times
200 ml) was grown in shake flasks in a medium containing 40 g l⁻¹ glucose and the adequate
mineral salts for nutrients and buffering of the medium. Cultivation conditions were: tempera-
ture 29 °C, pH controlled between 4.0 and 5.0, aeration about 15 l min⁻¹ and cultivation time
93 h. During the cultivation the glucose concentration in the fermentor was maintained above 5 g
l⁻¹ by continuously adding sterile glucose (40 g l⁻¹) solution. The mycelium was separated
from the medium by centrifugation and the culture supernatant was concentrated 1.6 times by
ultrafiltration (PCI ES 625 membranes).

The clarified supernatant was first fractionated by hydrophobic interaction chromatography.
The pH of the sample was adjusted to pH 6.0 and the conductivity of the sample to the value
corresponding to 10 mM sodium phosphate buffer, pH 6.0, containing 1.25 mol l⁻¹ (NH4)2SO4.
The sample was applied to a column (113 x 110 mm) of Phenyl Sepharose FF (Pharma-
cia), previously equilibrated with 10 mM sodium phosphate buffer, pH 6.0, containing 1.25
mol l⁻¹ (NH4)2SO4. Elution was started by the equilibrating buffer followed by a linear dec-
reasing gradient of ammonium sulphate from 1.25 M to 0 M. Fractions (each 450 ml) which
contained the major endoglucanase activity were combined; they eluted at the end of the decreasing
gradient and by 10 mM phosphate buffer. The other adsorbed proteins were eluted by distilled
water and the column was washed with 6 M urea.
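
To put the 450 ml fraction size in proportion, the bed volume of the column can be estimated from its dimensions. The sketch below assumes that "113 x 110 mm" means inner diameter x bed height of a cylindrical bed, which the text does not state; the function name is illustrative.

```python
# Sketch only: bed volume of a cylindrical chromatography column, assuming
# the quoted dimensions are inner diameter x bed height (an assumption).
import math


def bed_volume_ml(diameter_mm, height_mm):
    """Return the bed volume in millilitres (1 ml = 1000 mm^3)."""
    radius_mm = diameter_mm / 2.0
    return math.pi * radius_mm ** 2 * height_mm / 1000.0


if __name__ == "__main__":
    volume = bed_volume_ml(113, 110)  # Phenyl Sepharose FF column
    print(f"bed volume: {volume:.0f} ml")
    print(f"one 450 ml fraction: {450 / volume:.2f} bed volumes")
```

Under this reading the bed volume is roughly 1.1 litres, so each 450 ml fraction corresponds to somewhat less than half a bed volume.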


The enzyme preparation obtained in the first chromatographic step was equilibrated to 4 mM
sodium phosphate, pH 7.2, by gel filtration (Sephadex G-25 coarse). The equilibrated protein
solution was applied to a column (113 x 190 mm) of DEAE Sepharose FF (Pharmacia), pre-
equilibrated with the same buffer. Elution was performed first with the equilibrating buffer to
remove unadsorbed proteins and thereafter by stepwise additions of sodium chloride to a con-
centration of 200 mM. Fractions (each 900 ml) which contained the endoglucanase and
which eluted by 200 mM NaCl were collected and the fraction with the highest activity was
concentrated by ultrafiltration (Amicon PM-10 membranes). The specific activity of the
preparation was 360 nkat mg⁻¹ protein and a purification factor of ca. 30 was obtained when
compared to the supernatant of the fermentor liquid.
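
The purification factor is the ratio of the specific activity after purification to that of the starting material. As a minimal sketch, the two figures given above imply a specific activity of roughly 12 nkat mg⁻¹ in the fermentor supernatant; that derived value is an inference from the stated numbers, not a measurement reported in the text.

```python
# Sketch only: relation between specific activity and purification factor.
def purification_factor(specific_final, specific_start):
    """Ratio of final to starting specific activity (same units for both)."""
    return specific_final / specific_start


def implied_starting_specific_activity(specific_final, factor):
    """Back-calculate the starting specific activity from the factor."""
    return specific_final / factor


if __name__ == "__main__":
    start = implied_starting_specific_activity(360.0, 30.0)
    print(f"implied supernatant specific activity: ca. {start:.0f} nkat/mg")
```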

The preparation was further characterized by isoelectric focusing on PBE-94 anion exchange
material (Pharmacia). The column was equilibrated by 25 mM imidazole-HCl buffer, pH 7.4,
and elution was carried out by Polybuffer 74 (Pharmacia) - HCl buffer, pH 4.0, according to
the manufacturer's instructions. EGV, measured by β-glucanase activity, eluted from the
column at pH 6.6 - 7.2.

Example 8

pH-optimum of EGV

1.0 % solutions of barley β-glucan (Megazyme, Australia) were prepared in McIlvaine buffers,
diluted one to four (corresponding to ca. 50 mM citrate-phosphate buffer), in the pH range from 3.0
to 8.0. Activity of an enzyme sample prepared as described in Example 7 was assayed using
as substrate the β-glucan solutions prepared in the varying pH-values. The assay procedure
was otherwise similar to the procedure of the endoglucanase assay (IUPAC, 1987). Incubation
time in the assay was 10 min at 50 °C, after which the enzyme reaction was terminated by
boiling. Reducing sugar groups formed in the reaction were measured by DNS-reaction.



The pH-optimum of EGV was 6.0 - 6.5 (Figure 8). 



Example 9

pH-stability of EGV

An EGV sample was prepared as described in Example 7, except that the last concentration
by ultrafiltration was omitted (activity 48 nkat/ml, assayed at pH 6.3 against barley β-glucan,
analogously to the endoglucanase assay, IUPAC, 1987). This sample was diluted (1 part per 2
parts of buffer) by 100 mM buffers of sodium acetate and sodium phosphate, prepared in
different pH values. The diluted samples were incubated at 40 °C for 20 h, and the activity
was assayed as described earlier. The pH of incubation was measured after the incubation.

More than 80 % of the original activity was observed in the samples incubated at the pH range
from ca. pH 5.4 to ca. pH 6.8. The relative recovered activity is presented in Figure 9.
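
The dilution used in the stability test (1 part sample per 2 parts buffer) fixes the starting activity and the approximate buffer strength during incubation. The sketch below works these out, assuming the sample itself contributes negligible buffer; the 80 % line is simply 0.8 times the diluted starting activity. Function names are illustrative.

```python
# Sketch only: concentrations after mixing 1 part sample with 2 parts buffer.
def after_dilution(value, own_parts, other_parts):
    """Concentration of a component after mixing with a second liquid."""
    return value * own_parts / (own_parts + other_parts)


if __name__ == "__main__":
    activity = after_dilution(48.0, own_parts=1, other_parts=2)   # nkat/ml
    buffer_mm = after_dilution(100.0, own_parts=2, other_parts=1)  # mM
    print(f"starting activity after dilution: {activity:.1f} nkat/ml")
    print(f"buffer strength during incubation: ca. {buffer_mm:.0f} mM")
    print(f"80 % of original activity: {0.8 * activity:.1f} nkat/ml")
```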



Example 10 

Hydrolysis of insoluble cellulosic substrates by EGV

Avicel (Serva 14204), which is mainly crystalline cellulose, and phosphoric acid-swollen
amorphous (Walseth, 1952) cellulose were hydrolysed by the new endoglucanase enzyme
(EGV) alone and in combinations with the previously known cellobiohydrolases of T. reesei.
The preparation of EGV was obtained as described in Example 7. CBHI and CBHII were
purified from culture filtrates of Trichoderma reesei and were pure proteins as judged by
SDS-PAGE. As a control, the cellobiohydrolases were incubated without the addition of EGV.
The reducing sugars liberated in the treatments were assayed using the DNS method and the
reaction products were analysed by HPLC.



The substrates for the hydrolysis were prepared in 50 mM sodium citrate buffer, pH 5.8, in a
concentration of 10 g l⁻¹. EGV was dosed on the basis of activity against β-glucan at pH 5.8
(500 or 2000 nkat g⁻¹ substrate) and cellobiohydrolases (CBHI and CBHII) on the basis of
protein (1.0 or 4 mg g⁻¹ substrate). The reaction mixtures were incubated for 20 h at pH 5.8 at
40 °C after which the hydrolysis was terminated by boiling. The values for reducing sugars as
glucose assayed from the reaction mixture are presented in Tables 3 and 4. The enzyme
dosage was 500 nkat g⁻¹ substrate for EGV and 1.0 mg g⁻¹ substrate for CBHI and CBHII
(Table 3), and 2000 nkat g⁻¹ substrate for EGV and 4.0 mg g⁻¹ substrate for CBHI and CBHII
(Table 4). Duration of the hydrolysis was 20 h in both cases.
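
Because the enzymes are dosed per gram of substrate and the substrate is present at 10 g l⁻¹, the doses translate directly into concentrations in the reaction mixture. The following sketch shows the arithmetic; the function name is illustrative.

```python
# Sketch only: convert a dose per gram of substrate into a concentration per
# millilitre of reaction mixture at the stated substrate concentration.
SUBSTRATE_G_PER_L = 10.0


def dose_per_ml(dose_per_g_substrate, substrate_g_per_l=SUBSTRATE_G_PER_L):
    """Dose per ml of reaction mixture (same numerator unit as the input)."""
    return dose_per_g_substrate * substrate_g_per_l / 1000.0


if __name__ == "__main__":
    for nkat in (500.0, 2000.0):
        print(f"EGV {nkat:.0f} nkat/g -> {dose_per_ml(nkat):.0f} nkat/ml")
    for mg in (1.0, 4.0):
        print(f"CBH {mg:.1f} mg/g -> {dose_per_ml(mg):.2f} mg/ml")
```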



The major hydrolysis product of EGV was cellobiose but also cellotetraose was detected in
the hydrolysate by HPLC. The strong synergy of EGV with CBHI in the hydrolysis of these
substrates can be clearly seen. Even though EGV released only small amounts of soluble
sugars, the enhancing effect on the cellulose hydrolysis by CBHI was remarkable.



Table 3. Reducing sugars liberated by cellulases from T. reesei in the hydrolysis of
crystalline cellulose (Avicel) and amorphous (Walseth) cellulose

enzymes             reducing sugars as glucose (mg ml⁻¹)

                    Avicel        Walseth

EGV alone           0.00          0.01 (*)
CBHI alone          0.09          0.16
CBHII alone         0.23          0.67
CBHI and EGV        0.15          0.25
CBHII and EGV       0.23          0.74

(*) determined by HPLC




Table 4. Reducing sugars liberated by cellulases from T. reesei in the hydrolysis of
crystalline cellulose (Avicel) and amorphous (Walseth) cellulose

enzymes                     reducing sugars as glucose (mg ml⁻¹)

                            Avicel        Walseth

EGV alone                   0.03 (*)      0.05
CBHI alone                  0.35          0.49
CBHII alone                 0.43          1.44
CBHI and EGV                0.43          0.71
CBHII and EGV               0.47          1.67
EGV (larger dosage) (**)    0.06          0.13

(*) determined by HPLC

(**) dosed activity 5000 nkat g⁻¹ substrate
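
The synergy described above can be quantified as a degree of synergy, i.e. the reducing sugars released by the enzyme mixture divided by the sum of those released by the individual enzymes; a value above 1 indicates synergy. This measure is a common convention and is not defined in this example. The sketch below assumes that the Table 4 values are read row by row (Avicel first, then Walseth); the dictionary keys and the function name are illustrative.

```python
# Reducing sugars (mg/ml as glucose) as read from Table 4 (assumed row-by-row reading).
TABLE4 = {
    "Avicel":  {"EGV": 0.03, "CBHI": 0.35, "CBHII": 0.43,
                "CBHI+EGV": 0.43, "CBHII+EGV": 0.47},
    "Walseth": {"EGV": 0.05, "CBHI": 0.49, "CBHII": 1.44,
                "CBHI+EGV": 0.71, "CBHII+EGV": 1.67},
}

def degree_of_synergy(combined, *individual):
    """Yield of the mixture divided by the summed yields of its components."""
    total = sum(individual)
    if total <= 0:
        raise ValueError("sum of individual yields must be positive")
    return combined / total

if __name__ == "__main__":
    for substrate, values in TABLE4.items():
        for cbh in ("CBHI", "CBHII"):
            ds = degree_of_synergy(values[cbh + "+EGV"], values[cbh], values["EGV"])
            print(f"{substrate:8s} {cbh:5s} + EGV: degree of synergy {ds:.2f}")
```

For Walseth cellulose, for instance, CBHI with EGV gives 0.71 / (0.49 + 0.05), i.e. about 1.31.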



Example 11 

Hydrolysis of soluble cellulosic substrates by EGV

HEC (hydroxyethyl cellulose, Fluka 54290), which is a soluble substituted cellulose polymer,
and barley β-glucan (Megazyme, Australia) were hydrolysed by the new endoglucanase
enzyme (EGV) alone and in combination with two previously known endoglucanases of T.
reesei. The preparation of EGV was obtained as described in Example 7. Endoglucanases EGI
and EGII were purified from culture filtrates of Trichoderma reesei and were pure proteins as
judged by SDS-PAGE. As a control, the endoglucanases were incubated without the addition
of EGV. The reducing sugars liberated in the treatments were assayed using the DNS method.

The substrates for the hydrolysis were prepared in 50 mM sodium acetate buffer, pH 5.8, at a
concentration of 10 g l⁻¹. The endoglucanases were dosed on the basis of activity against β-glucan
(dosage for each: 100 nkat g⁻¹ substrate). The reaction mixtures were incubated at 40 °C, after
which the hydrolysis was terminated by boiling. The values for reducing sugars as glucose
assayed from the reaction mixtures are presented for HEC in Table 5 and for β-glucan in
Table 6. The enhancing effect of EGV on the hydrolysis of HEC and especially of β-glucan can
clearly be seen.

Table 5. Reducing sugars liberated by endoglucanases from T. reesei in the hydrolysis of
hydroxyethyl cellulose.

enzymes             hydrolysis products as reducing sugars (mg ml⁻¹)

                    2 h hydrolysis    20 h hydrolysis

EGV alone           0.00              0.00
EGI alone           0.14              0.26
EGII alone          0.18              0.29
EGI and EGV         0.16              0.27
EGII and EGV        0.24              0.29




Table 6. Reducing sugars liberated by endoglucanases from T. reesei in the hydrolysis of
barley β-glucan. Duration of hydrolysis 2 h.

enzymes             hydrolysis products as reducing sugars (mg ml⁻¹)

EGV alone           0.96
EGI alone           1.1
EGII alone          0.90
EGI and EGV         2.4
EGII and EGV        2.0



Example 12 

Modelling of the cellulose-binding domain of EGV

Sequences of the five cellulose-binding domains (CBD) of the T. reesei cellulases were
aligned to define conserved regions using the computer program MALIGN (Johnson et al.,
1993; Johnson and Overington, 1993), which is suitable for this purpose because the percentage
identity between the five CBDs is ~60 %. The construction of a 3-D model of the EGV
CBD was performed using the COMPOSER method (Sutcliffe et al., 1987a,b; Blundell et al.,
1988; Sali et al., 1990), which is based on rules derived from known three-dimensional
structures. These rules can be used to define a conserved core for the model, to select appropriate
fragments for the variable regions and to replace the side chains. The NMR-based
structure of the CBHI CBD (Kraulis et al., 1989) was used as a basis for the EGV model.

The computer program CHARMm ver. 22 (Brooks et al., 1983) was used to soak the completed
model in a 35 Å cubic box of water and to refine the model through energy minimization
and molecular dynamics simulation of 100 ps under periodic boundary conditions.
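
The percentage identity quoted above is obtained by counting identical residues over the aligned columns of two sequences. A minimal sketch follows, assuming that gaps are written as "-" and that gapped columns are excluded from the denominator (other conventions exist). The two aligned fragments in the usage lines are made-up toy strings, not the actual alignment of the T. reesei CBDs, and the function name is illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    toy_a = "QCGG-GWTGP"
    toy_b = "QCGGIGYSGP"
    print(f"{percent_identity(toy_a, toy_b):.1f} % identity")
```

For a set of five domains, the same function would be applied to each of the ten pairs and the values averaged or reported as a range.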



The sequence alignment shows that the CBDs of T. reesei are highly conserved except for one
insertion and one deletion of a single amino acid in EGV. Therefore most parts of the 3-D structure of
the CBHI CBD, determined by NMR (Kraulis et al., 1989), could be used as a conserved core
for modelling of the EGV CBD by the computer program COMPOSER. The CBHI CBD is a
wedge-shaped domain having two flat surfaces. One of these is predominantly hydrophilic and
contains three tyrosine residues that have been shown by chemical modification to be important
for the binding of the enzyme to cellulose (Claeyssens and Tomme, 1989). The tyrosine located
at the tip of the wedge has also been demonstrated by site-directed mutagenesis to be involved
in substrate binding (Reinikainen et al., 1992). This residue is replaced by a tryptophan
in the EGV CBD (Fig. 3B), an amino acid substitution also seen in many other fungal
CBDs. Both tyrosine and tryptophan residues interact readily with carbohydrates.

The backbones of the CBHI and EGV CBDs are very similar. Two disulfide bridges in
identical positions stabilize the structures. The insertion and the deletion in EGV are situated
in a single loop and thus compensate each other, leaving the loop backbone unchanged
compared with that of CBHI. However, there is an interesting difference in the backbone
conformation at the other, more hydrophobic, flat face. A substantial change in torsion angle
was observed at a glycine residue of EGV, where the φ-angle of the glycine changes
from negative to positive during the refinement simulation. This causes the loop at region
217-221 to be pushed outwards. Interestingly, the corresponding loop in the CBHI CBD
contains a proline residue, mutation of which reduces the activity of CBHI against
crystalline cellulose (Reinikainen et al., 1992). Simulations of the other CBDs from T. reesei
show that this region is the most flexible region of the CBD (A.-M. Hoffrén, T.T. Teeri, and
O. Teleman, submitted).
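
A backbone torsion angle such as the one discussed above is the signed dihedral defined by four consecutive atoms (for the φ angle: C of the preceding residue, N, Cα and C). The sketch below computes it from Cartesian coordinates with the standard atan2 formula. It is not the procedure used in the modelling described here; the coordinates in the usage lines are arbitrary test points, not atoms of the EGV model.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, in the range (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

if __name__ == "__main__":
    # Arbitrary points: moving the last point flips the sign of the angle.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1)))
```

Tracking this value for one residue over the frames of a trajectory would show a change of sign of the kind described above.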

There are also moderate differences in the backbone regions in which a proline and a
serine residue of the CBHI CBD are replaced by a glutamine and a proline
residue, respectively, in the EGV CBD. Moderate differences are also found in side
chain conformations, but these changes are within the range of fluctuation occurring during
simulation. One of the tyrosines forming the hydrophilic face of the EGV CBD points
more upwards than its counterpart in CBHI. This difference in orientation is significant,
but the flexibility allows the tyrosine in EGV to occupy the same position as its counterpart in CBHI.
Thus the difference in orientation is unlikely to affect substantially the affinity for cellulose.
Overall, the EGV CBD seems to be less wedge-shaped and its hydrophobic surface more
rounded than that of the CBHI CBD.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: OY ALKO AB
(B) STREET: Salmisaarenranta 7 H
(C) CITY: Helsinki
(E) COUNTRY: Finland
(F) POSTAL CODE: FIN-00160

(ii) TITLE OF INVENTION: Novel Endoglucanase Enzyme

(iii) NUMBER OF SEQUENCES: 10

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disc
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: FI 932521
(B) FILING DATE: 2-JUN-1993

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 884 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Trichoderma reesei
(B) STRAIN: QM9414

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 40..765

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GATCTTCCAT CTCGTGTCTT GCTTGTAACC ATCGTGACC ATG AAG GCA ACT CTG  54
                                           Met Lys Ala Thr Leu
                                           1               5

GTT CTC GGC TCC CTC ATT GTA GGC GCC GTT TCC GCG TAC AAG GCC ACC  102
Val Leu Gly Ser Leu Ile Val Gly Ala Val Ser Ala Tyr Lys Ala Thr
                10                  15                  20

ACC ACG CGC TAC TAC GAT GGG CAG GAG GGT GCT TGC GGA TGC GGC TCG  150
Thr Thr Arg Tyr Tyr Asp Gly Gln Glu Gly Ala Cys Gly Cys Gly Ser
            25                  30                  35

AGC TCC GGC GCA TTC CCG TGG CAG CTC GGC ATC GGC AAC GGA GTC TAC  198
Ser Ser Gly Ala Phe Pro Trp Gln Leu Gly Ile Gly Asn Gly Val Tyr
        40                  45                  50

ACG GCT GCC GGC TCC CAG GCT CTC TTC GAC ACG GCC GGA GCT TCA TGG  246
Thr Ala Ala Gly Ser Gln Ala Leu Phe Asp Thr Ala Gly Ala Ser Trp
    55                  60                  65

TGC GGC GCC GGC TGC GGT AAA TGC TAC CAG CTC ACC TCC ACG GGC CAG  294
Cys Gly Ala Gly Cys Gly Lys Cys Tyr Gln Leu Thr Ser Thr Gly Gln
70                  75                  80                  85

GCG CCC TGC TCC AGC TGC GGC ACG GGC GGT GCT GCT GGC CAG AGC ATC  342
Ala Pro Cys Ser Ser Cys Gly Thr Gly Gly Ala Ala Gly Gln Ser Ile
                90                  95                  100

ATC GTC ATG GTC ACC AAC CTG TGC CCG AAC AAT GGG AAC GCG CAG TGG  390
Ile Val Met Val Thr Asn Leu Cys Pro Asn Asn Gly Asn Ala Gln Trp
            105                 110                 115

TGC CCG GTG GTC GGC GGC ACC AAC CAA TAC GGC TAC AGC TAC CAT TTC  438
Cys Pro Val Val Gly Gly Thr Asn Gln Tyr Gly Tyr Ser Tyr His Phe
        120                 125                 130

GAC ATC ATG GCG CAG AAC GAG ATC TTT GGA GAC AAT GTC GTC GTC GAC  486
Asp Ile Met Ala Gln Asn Glu Ile Phe Gly Asp Asn Val Val Val Asp
    135                 140                 145

TTT GAG CCC ATT GCT TGC CCC GGG CAG GCT GCC TCT GAC TGG GGG ACG  534
Phe Glu Pro Ile Ala Cys Pro Gly Gln Ala Ala Ser Asp Trp Gly Thr
150                 155                 160                 165

TGC CTC TGC GTG GGA CAG CAA GAG ACG GAT CCC ACG CCC GTC CTC GGC  582
Cys Leu Cys Val Gly Gln Gln Glu Thr Asp Pro Thr Pro Val Leu Gly
                170                 175                 180

AAC GAC ACG GGC TCA ACT CCT CCC GGG AGC TCG CCG CCA GCG ACA TCG  630
Asn Asp Thr Gly Ser Thr Pro Pro Gly Ser Ser Pro Pro Ala Thr Ser
            185                 190                 195

TCG AGT CCG CCG TCT GGC GGC GGC CAG CAG ACG CTC TAT GGC CAG TGT  678
Ser Ser Pro Pro Ser Gly Gly Gly Gln Gln Thr Leu Tyr Gly Gln Cys
        200                 205                 210

GGA GGT GCC GGC TGG ACG GGA CCT ACG ACG TGC CAG GCC CCA GGG ACC  726
Gly Gly Ala Gly Trp Thr Gly Pro Thr Thr Cys Gln Ala Pro Gly Thr
    215                 220                 225

TGC AAG GTT CAG AAC CAG TGG TAC TCC CAG TGT CTT CCT TGAGAAGGCC  775
Cys Lys Val Gln Asn Gln Trp Tyr Ser Gln Cys Leu Pro
230                 235                 240

CAAGATAGCC ATGTCTCTCT AGCATTCTTC CGGCGTCAGT CTGATCTGCC TATTTAATCA  835

GGTCAGTCAA TATGTATCCA GAGATAATAA ATTATGTATA TTATAGCAG  884
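
The CDS feature of SEQ ID NO: 1 (positions 40..765) can be checked by translating the listing with the standard genetic code. The sketch below does this for the first 102 nucleotides only. The nucleotide string is transcribed from the listing above and is therefore an assumption wherever the printed characters are unclear; the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated with the bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# First 102 nucleotides of SEQ ID NO: 1 as transcribed from the listing (assumption).
FIRST_102_NT = ("GATCTTCCATCTCGTGTCTTGCTTGTAACCATCGTGACC"             # 5' untranslated, 1..39
                "ATGAAGGCAACTCTG"                                     # codons 1..5
                "GTTCTCGGCTCCCTCATTGTAGGCGCCGTTTCCGCGTACAAGGCCACC")  # codons 6..21

def translate(dna, start=1):
    """Translate from the 1-based position `start` until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

if __name__ == "__main__":
    print(translate(FIRST_102_NT, start=40))
    # A CDS spanning positions 40..765 holds this many codons:
    print((765 - 40 + 1) // 3)
```

The number of codons in the 40..765 feature, 242, matches the number of residues written out in SEQ ID NO: 2.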

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 242 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Lys Ala Thr Leu Val Leu Gly Ser Leu Ile Val Gly Ala Val Ser
1               5                   10                  15

Ala Tyr Lys Ala Thr Thr Thr Arg Tyr Tyr Asp Gly Gln Glu Gly Ala
            20                  25                  30

Cys Gly Cys Gly Ser Ser Ser Gly Ala Phe Pro Trp Gln Leu Gly Ile
        35                  40                  45

Gly Asn Gly Val Tyr Thr Ala Ala Gly Ser Gln Ala Leu Phe Asp Thr
    50                  55                  60

Ala Gly Ala Ser Trp Cys Gly Ala Gly Cys Gly Lys Cys Tyr Gln Leu
65                  70                  75                  80

Thr Ser Thr Gly Gln Ala Pro Cys Ser Ser Cys Gly Thr Gly Gly Ala
                85                  90                  95

Ala Gly Gln Ser Ile Ile Val Met Val Thr Asn Leu Cys Pro Asn Asn
            100                 105                 110

Gly Asn Ala Gln Trp Cys Pro Val Val Gly Gly Thr Asn Gln Tyr Gly
        115                 120                 125

Tyr Ser Tyr His Phe Asp Ile Met Ala Gln Asn Glu Ile Phe Gly Asp
    130                 135                 140

Asn Val Val Val Asp Phe Glu Pro Ile Ala Cys Pro Gly Gln Ala Ala
145                 150                 155                 160

Ser Asp Trp Gly Thr Cys Leu Cys Val Gly Gln Gln Glu Thr Asp Pro
                165                 170                 175

Thr Pro Val Leu Gly Asn Asp Thr Gly Ser Thr Pro Pro Gly Ser Ser
            180                 185                 190

Pro Pro Ala Thr Ser Ser Ser Pro Pro Ser Gly Gly Gly Gln Gln Thr
        195                 200                 205

Leu Tyr Gly Gln Cys Gly Gly Ala Gly Trp Thr Gly Pro Thr Thr Cys
    210                 215                 220

Gln Ala Pro Gly Thr Cys Lys Val Gln Asn Gln Trp Tyr Ser Gln Cys
225                 230                 235                 240

Leu Pro



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 166 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Tyr Lys Ala Thr Thr Thr Arg Tyr Tyr Asp Gly Gln Glu Gly Ala Cys
1               5                   10                  15

Gly Cys Gly Ser Ser Ser Gly Ala Phe Pro Trp Gln Leu Gly Ile Gly
            20                  25                  30

Asn Gly Val Tyr Thr Ala Ala Gly Ser Gln Ala Leu Phe Asp Thr Ala
        35                  40                  45

Gly Ala Ser Trp Cys Gly Ala Gly Cys Gly Lys Cys Tyr Gln Leu Thr
    50                  55                  60

Ser Thr Gly Gln Ala Pro Cys Ser Ser Cys Gly Thr Gly Gly Ala Ala
65                  70                  75                  80

Gly Gln Ser Ile Ile Val Met Val Thr Asn Leu Cys Pro Asn Asn Gly
                85                  90                  95

Asn Ala Gln Trp Cys Pro Val Val Gly Gly Thr Asn Gln Tyr Gly Tyr
            100                 105                 110

Ser Tyr His Phe Asp Ile Met Ala Gln Asn Glu Ile Phe Gly Asp Asn
        115                 120                 125

Val Val Val Asp Phe Glu Pro Ile Ala Cys Pro Gly Gln Ala Ala Ser
    130                 135                 140

Asp Trp Gly Thr Cys Leu Cys Val Gly Gln Gln Glu Thr Asp Pro Thr
145                 150                 155                 160

Pro Val Leu Gly Asn Asp
                165
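
The molecular weight of the unglycosylated enzyme can be estimated from the amino acid sequence by summing average residue masses and adding one water molecule. The sketch below does this for residues 18 to 242 of SEQ ID NO: 2. It assumes that the mature protein starts at the tyrosine with which SEQ ID NO: 3 begins (i.e. that the first 17 residues form a signal peptide) and it ignores disulfide bridges and any other modification; the one-letter string is transcribed from the listing and the names are illustrative.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153

# SEQ ID NO: 2 in one-letter code, 16 residues per line as in the listing.
PRECURSOR = (
    "MKATLVLGSLIVGAVS" "AYKATTTRYYDGQEGA" "CGCGSSSGAFPWQLGI" "GNGVYTAAGSQALFDT"
    "AGASWCGAGCGKCYQL" "TSTGQAPCSSCGTGGA" "AGQSIIVMVTNLCPNN" "GNAQWCPVVGGTNQYG"
    "YSYHFDIMAQNEIFGD" "NVVVDFEPIACPGQAA" "SDWGTCLCVGQQETDP" "TPVLGNDTGSTPPGSS"
    "PPATSSSPPSGGGQQT" "LYGQCGGAGWTGPTTC" "QAPGTCKVQNQWYSQC" "LP"
)

SIGNAL_LENGTH = 17  # assumption: mature protein starts at Tyr18, as SEQ ID NO: 3 does

def average_mass(sequence):
    """Average molecular mass in daltons of an unmodified polypeptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

mature = PRECURSOR[SIGNAL_LENGTH:]
mass_kda = average_mass(mature) / 1000.0

if __name__ == "__main__":
    print(f"{len(mature)} residues, {mass_kda:.1f} kDa")
```

Under these assumptions the estimate falls inside the 20 to 25 kDa range given in the claims for the unglycosylated form.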



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2211 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

6AATTCTCAC GGTGAATOTA OGCCTTTTOT AGGOTAGQAA TTGTCACTCA AOCACCCOCA 60 


ACCTCCATTA CGCCTCCCCC ATAGAGTTCC CAATCAGTGA OTCATGOCAC TGTTCTCAAA 120 

TAGATTGGGG AGAAGTTGAC TTCC6CCCAG AGCTGAAGGT CGCACAACCG CATOATATAG 180 

45 GGTCGGCAAC GGCAAAAAAO CACGTGGCTC ACOGAAAAGC AAaATGTTTO CGATCTAACA 240 

TCCAGQAACC TGGATACATC CATCATGACO CACOAOCACT TTGATCTQCT GGTAAACTCG 300 

TATTCGCCCT AAACCGAAGT GCGTGGTAAA TCTACACGTG GOCCCCTTTC GGTATACTGC 360 


GTGTGTCTTC TCTAGGTGCA r i ' CmHJC TT CCTCTAGTGT TGAATTGTTT OTOTTGGGAO 420 

TCCGAGCTGT AACTACCTCT GAATCTCTGG AOAATGOTOG ACTAACGACT ACCGTQCACC 480 

55 TGCATCAT6T ATATAATAGT GATCCTGAGA ACGGGGGTTT GGAGCAATGT GGGACTTTQA 540 

TGGTCATCAA ACAAAGAACG AAGAOGCCTC TTTTGCAAAG TTTTOTTTCG GCTACGOTGA 600 

AGAACTGGAT ACTTGTTOTG TCTTCTOTGT ATTTTTOTOG CAACAAGAGG CCA6A6ACAA 660 


TCTATTCAAA CACCAAQCTT GCTCTTTTGA GCTACAAGAA CCTGTGGGGT ATATATCTAG 720 

AGTTGTGAAG TCGGTAATCC CGCTGTATA6 TAATACGAGT CGCATCTAAA TACTOCGAAG 780 

65 CT6CTGCGAA CCCGGAGAAT C6AGATGT6C TGGAAA6CTT CTAGCGA6CG GCTAAATTAG 640 

CATGAAAGGC TATGAGAAAT TCTGGAGACG GCTTGTTGAA TCATGGCGTT CCATTCTTCG 900 
ACAAGCAAAQ CGTTCCGTCG CAGTAGCAGG CACTCATTCC CGAAAAAACT OGGAGATTCC 960 

TAAGTAGCGA TQGAACCOGA ATAATATAAT A66CAATACA WGAGTTGCC TCGACGGT7G 1020 

CAATGCAG6G GTACTGAfiCT TGGACATAAC TGTTCCOTAC CCCACCTCTT CTCAACCTTT 1080 

GGCGTTTCCC TGATTCA6CG TACCCGTACA A6TCGTAATC ACTATTAACC CAGACTOiACC 1140 

GGACGTGTTT TGCCCTTCAT TTGGAGAAAT AATGTCATTG OGATOTGTAA TTTOCCTGCT 1200 

T6ACC6ACTG GGGCT6TTCG AAGCCCGAAT 6TAQ6ATT6T TATCCX5AACT CTGCTCGTAG 1260 

AGGCATGTTG TQAATCTGTG TCGOGCAGGA CACGCCTCGA AGGTTCAC6G CAAG6GAAAC 1320 

15 CACC6ATA6C AGTGTCTAGT AGCAACCTGT AAAGCC6CAA TGCAGCATCA CTGGAAAATA 1380 

CAAACCAATG GCTAAAAGTA CATAA6TTAA TGCCTAAAGA AGTCATATAC CAGCGQCTAA 1440 

TAATTGTACA ATCAA6TG0C TAAACGTACC GTAATTTGCC AAC6COTTGT GGGGTT6CAG 1500 


AA6CAACGGC AAAGCCCACT TCCCACGTTT GTTTCTTCAC TCAGTCCAAT CTCAGCTGGT 1560 

GATCCCCCAA TTGGGTC6CT TGTTTGTTCC GGTGAAGT6A AAQAAGAOkG A6GTAAGAAT 1620 

25 GTCTGACTCG GAGC6TTTTG CATACAAOCA AGGGCAGTGA TGGAAGACAG TQAAATGTT6 1680 

ACATTCAAOO AGTATTTAOC CAAGGGATGCTTSACTGTATC GTGtAAOGAQ GTTTGTCTGC 1740 

CGATACGACG AATACTGTAT AGTCACTTCT GATGAAGTGG TCCATATTGA AATGTAAGTC IBOO 


GGCACT6AAC AGGCAAAA6A TTGAOTTOAA ACTGCCTAAG ATCTCGGGCC CTCGGGCTTC 1860 

GGCTTTGOGT GTACATGTTT GTGCTCCGGG CAAATGCAAA GTGTG6TA6G ATC6ACACAC 1920 

35 TGCTGCCTTT ACCAAGCAGC TGAGGGTATG TGATAGGCAA ATGTTCAOGG GCCACTGCAT 1980 

.GGTTTCGAAT AGAAAGAGAA OCTTAOCCAA 6AACAATAGC CGATAAAOAT AOCCTCATTA 2040 

AAC6AAATGA GCTAGTAOGC AAAGTCA60G AAT6TGTATA TATAAAG6TT CQAG6TC0GT 2100 

GCCTCCCTCA TGCTCTCCCC ATCTACTCAT CAACTCAGAT CCTCCAOGAG ACTTGTACAC 2160 

CATCTTTTCA GOCACAOAAA CCCAATAQTC AACC60GQAC TGOGCATCAT 6 2211 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1627 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGC6GTATT0 GCTACAGCGO CCCCACGGTC TGCGCCA6CG OCACAACTTG CCAG6TCCT6 60 

AACCCTTACT ACTCTCAGTG CCTGTAAAGC TCCGTGCGAA AGCCTGACGC ACCGGTAGAlT 120 

60 TCTTGGTGAG CCCGTATCAT GACGGCGGCG GGAGCTACAT QGCCCCGGGT GATTTATTTT 180 

TTTTGTATCT ACTTCTOACC CTTTTCAAAT ATACGGTCAA CTCATCTTTC ACTGGAGATG 240 

CGGCCTGCTT GGTATTGCGA TGTTGTCAGC TTGGCAAATT GTGGCTTTCG AAAACACAAA 300 

ACQATTCCTT AGTAGCCATG CATCGGGATC CTTTAAOATA ACGGAATAGA AGAAAGAGGA 360 



65 

A^TTAAAA*A ^ CAAACATCCC GTrOlTiUVCC CSTASAATCQ CCOCICITCG • 

TCTATCCCAO TACCACGGCA AAGGTATIIC ATGAKGTTC AATOTTGATA TTGTTCCCGC 
CAOTATCOCT GCACCCCCAT CtCCGCGAAT CTCCTCrTCT CGAACGCGGT A6TGGCGCGC 
CAATWOTAA TGACCATAGG GAGACAAACA GCATAATAGC AAOCTGOAA ATTACTOGCG 
CAATAArCGA GAACACAGTG AGACCATAOC TGGOGQCXrTO GAA«fiCACTC TTGGAGACCA 
ACTTGTCCGT TCCGAGGCCA ACTTGCAITG CTGTaU«»C GATQACAACG TAGCCGAGGA 
CCGTCACAA6 OGACGCAAAfl TTGTCGCGGA TGA6GTCTCC GTAGATOOCA TAGCCQ6CAA 
XCCGAGAGTA GCXTCTCAAC AGGWGCCTT TTCGAAACCG GIAAACCTW TTCAGAOGTC 
CTAGCCGCAG CTCACCGIAC CAGTATCGAG GATO3ACGGC AGAATAGCAO TGGCTCTCCA 
GGATTTGACT 6GACAAAATC TTCOGTATT CCCAGGTcAC AGTGTCTGGC AGAAGTCCCT 
TCTCOCGTGC ANICQAAAGT CGCXATAfiXG CGCAATGAGA GCACAGTAGG AGAATABGAA 
CCCGCGAGCA CATTGTTCAA TCTCCACAK AATTOGATOA CTGCTGGGCA GAATGTOCTG 
CCTCCAAAAT CCTGCGTCCA ACAQATACTC TG6CA3G0GC TTCAOAWAA TCCCTCIGG8 
CCCCOGATA ASATGCAGCT CTGGATTCTC G6TtAC«ATG ATATCGCGA6 ASAGCAC6A0 
TTOmOUtOG AGGOACAGGA GGCATAG8TC GCOCAGGCCC ATAACCAGTC TTGCAOGCA 
^aTCTTAC CTCACGAGGA GCTCCTGATG CAGAAAC^C TCCKTOITCC TGATTGSGTT 
OAGAATTTCA ICGCrCCTGG ATCGTATGGT TGCTGGCAAO ACCCTOCTIA ACCGTOCCGT 
6TCATGGTCA «:TCT«nW CTTCGTCGCI GGCCTGTCTT TGCAAITOGA CAGCAAATGG 
TOOAGATCTC TCTATOGIGA CAGTCATGGT AGCGAWOCT AOGTGTCGTT GCACCCACAT 
;,SSOCGAAAT GCGAAGTOGA AAGAATTTCC CGGMTGCGGA AIGAAOTCIC GTCATTTXOT 
ACIOGTACTC GACACCTCCA CtGAAGTGTT AATAATGGAT CCACGATGCC AAAAAGCTTG 
TGOklGC 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1137 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GAATTCTCAC OGTGAATOTA GGCCTTTTGT AGGGTAGOAA TTOTCACTCA 
ACCTCCATTA COCCTCCCCC ATAGAGTTCC CAATCAOTGA GTCATGGCAC 
TAGATTCGGG AGAAGTTGAC TTCOGCCCAG AGCTGAAGGT CGCACAACCG 
GGTCG6CAAC GOCAAAAAAG CACGTGGCTC ACCGAAAAGC AAGATGTTTG 
TCCAGGAACC TGGATACATC CATCATCACG CACGACCACT TTGATCTtKTT 
TATTC6CCCT AAACCGAAGT GCGT«GTAAA TCTACACGTG GGCCCCTTTC 



TGTTCTCAAA 
CATGATATA6 
CGATCTAACA 
GGTAAACTC6 
GGTATACTGC 



420 
460 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
13B0 
1440 
1500 
1560 
1620 
1627 



60 
120 
IBO 
240 
300 
360 




GTGTGTCTTC TCTAGGTGCA TTCTTTCCTT CCTCTAGTGT TGAATTGTTT GTGTTGGGAG 420 

TCCGAGCTCT AACTACCTCT GAATCTCTGO AGAATOGTGG ACTAAC6ACT ACCGT6CACC 480 

TGCATCATOT ATATAATAGT GATCCTGAGA A6GG6G6TTT GGA6CAATGT GGGACTTTGA 540 

TGGTCATCAA ACAAA6AACG AAGACGCCTC TTTT6CAAA6 TTTTGTTTCG aCTACGGTGA 600 

AGAACTGGAT ACTTGTTGTG TCTTCTGTGT ATTTTTGTGG CAACAAGAOG CCAGAGACAA 660 

TCTATTCAAA CACCAAGCTT GCTCTTTTGA OCTACAAGAA CCTGTGGGOT ATATATCTAG 720 

TGGCCAGAAT 6CCTAGGTCA CCTCTAGAGA GTTGAAACTG CCTAAGATCT C6QQCCCTCG 780 

15 GGCTTCGGCT TTGGGTGTAC ATGTTTGTGC TCC6GGCAAA TGCAAAGTGT GGTAGGATC6 840 

ACACACTGCT GCCTTTACCA AGCAGCTGAG GGTATGTGAT AGGCAAATGT TCAGGG6CCA 900 

CT6CATGGTT TC6AATAGAA AQAGAAOCTT AOCCAAGAAC AATAGCCGAT AAAGATAOCC 960 


TCATTAAACG AAATGAGCTA GTAG6CAAA6 TCAG06AAT6 TG131TATATA AAGGTTCGAG 1020 

GTCCGT6CCT CCCTCATGCT CTCCCCATCT ACTCATCAAC TCAGATCCTC CAGGAGACTT 1080 

25 GTACACCATC TTTTGAGGCA CAGAAACCCA ATAGTCAACC G0GGACT6C6 CATCATG 1137 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGTCTGAAGG ACGTGGAATG ATGG  24

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GATGCATCGA TCGTCCGCGG GTTGAGAGAA GTTGTTGGAT TGATCAAAAA G  51

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAGAGACCGC GGTGATCTTC CATCTCGTGT CTTGCTTGTA AC  42
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATCGTGGATC CATTATTAAC ACTTCGGTGG  30
Claims:

1. An isolated DNA sequence, characterized in that it codes for a Trichoderma
enzyme having endoglucanase activity, the molecular weight of the unglycosylated form of
said enzyme being 20 to 25 kDa and said enzyme comprising a core domain, a linker region
and a cellulose binding domain, and functional parts thereof.

2. The DNA sequence according to claim 1, wherein the DNA sequence hybridizes to the
DNA sequence of SEQ ID NO. 1 or to the DNA sequence of SEQ ID NO. 11.

3. The DNA sequence according to claim 1, wherein the DNA sequence codes for the amino
acid sequence of SEQ ID NO. 2.

4. The DNA sequence according to claim 1, wherein the DNA sequence is the DNA
sequence of SEQ ID NO. 1 or the DNA sequence of SEQ ID NO. 11.

5. A DNA sequence, which codes for a polypeptide having endoglucanase activity, said
sequence coding for the sequence of SEQ ID NO. 3 or functional equivalents thereof.

6. A vector construction, characterized in that it comprises the DNA sequence of
any one of claims 1 to 5.

7. A microorganism host, characterized in that it has been transformed with the
DNA sequence of any one of claims 1 to 5 or with a vector construction of claim 6 and is
able to express said DNA sequence.

8. The host according to claim 7, wherein said host is a fungal or yeast host.

9. The host according to claim 7 or 8, wherein said host is Trichoderma.

10. The host according to claim 7 or 8, wherein said host is Saccharomyces.
11. A culture medium, characterized in that it comprises the enzymes secreted
from the host of claims 7 to 10.

12. A product derived from the culture medium of claim 11 by purifying, concentrating,
drying or immobilizing said culture medium.

13. A method for isolating a DNA sequence coding for a Trichoderma enzyme having
endoglucanase activity, characterized by
- enriching the mRNA pool of a Trichoderma strain producing endoglucanase activity in
  respect of the mRNA of the endoglucanase by culturing the Trichoderma strain in
  conditions which will induce the endoglucanase production of said strain,
- isolating mRNA from the strain,
- preparing cDNA corresponding to the isolated mRNA,
- placing the cDNA thus obtained in a vector under the control of a yeast promoter,
- transforming the recombinant plasmids into a yeast strain which naturally does not
  produce the endoglucanase in order to provide an expression library,
- cultivating the yeast clones thus obtained on a culture medium in order to express the
  expression library in the yeast,
- separating the yeast clones producing the endoglucanase from the other yeast clones,
- isolating the plasmid-DNA of said separated yeast clones, and,
- if desired, sequencing the DNA in order to determine the DNA sequence coding for the
  endoglucanase.

14. The method according to claim 13, wherein the recombinant plasmids are transformed into
a strain of the yeast Saccharomyces cerevisiae.

15. The method according to claim 13, wherein the yeast clones are cultivated on a culture
medium containing at least one substrate selected from the group comprising β-glucan,
hydroxyethyl cellulose, methylumbelliferyl lactoside and methylumbelliferyl cellobioside.

16. A method for constructing a Trichoderma host capable of expressing an endoglucanase
enzyme, characterized in that it comprises
a) isolating the DNA sequence coding for an endoglucanase, the molecular weight of
which in unglycosylated form is 20 to 25 kDa, or parts thereof, from a suitable donor
strain,
b) constructing a vector carrying said DNA sequence and
c) transforming the vector obtained into a Trichoderma host.

17. A method for producing a Trichoderma endoglucanase enzyme, characterized
in that it comprises the steps of
a) isolating the DNA sequence coding for an endoglucanase, the molecular weight of
which in unglycosylated form is 20 to 25 kDa, or functional parts thereof, from a
suitable donor strain,
b) constructing a vector carrying said DNA sequence,
c) transforming the vector obtained to a Trichoderma host to obtain a recombinant host
strain,
d) cultivating said recombinant host strain under conditions permitting expression of
said endoglucanase, and
e) recovering said endoglucanase.

18. A method for constructing a Saccharomyces host capable of expressing an endoglucanase
enzyme, characterized in that it comprises
a) isolating the DNA sequence coding for an endoglucanase, the molecular weight of
whose unglycosylated form is 20 to 25 kDa, and functional parts thereof, from a
suitable donor strain,
b) constructing a vector carrying said DNA sequence and
c) transforming the vector obtained into a Saccharomyces host.

19. A method for producing a Trichoderma endoglucanase enzyme, characterized
by
a) isolating the DNA sequence coding for an endoglucanase, the molecular weight of
which is 20 to 25 kDa in unglycosylated form, or functional parts thereof, from a
suitable donor strain,
b) constructing a vector carrying said DNA sequence,
c) transforming the vector obtained to a Saccharomyces host to obtain a recombinant
host strain,
d) cultivating said recombinant host strain under conditions permitting expression of
said endoglucanase, and
e) recovering said endoglucanase.

20. An enzyme preparation, characterized in that it contains an endoglucanase
enzyme having the amino acid sequence of SEQ ID NO. 2 or functional derivatives thereof.

21. An enzyme preparation, characterized in that it contains elevated levels of an
endoglucanase enzyme having in unglycosylated form a molecular weight in the range from
20 to 25 kDa, or functional parts thereof, and exhibiting catalytic activity towards the
substrate β-glucan.

22. The enzyme preparation according to claim 22, wherein the endoglucanase enzyme
exhibits activity towards crystalline cellulose substrate.

23. A method for enzymatically modifying a cellulosic substrate, characterized
by contacting said substrate with an enzyme preparation according to any one of
claims 20 to 22.

24. The method according to claim 23, wherein the cellulosic substrate is fibrous.
[Figure: nucleotide sequence of the EGV cDNA, ending in a poly(A) tail, with the deduced amino acid sequence (cf. SEQ ID NO: 1 and SEQ ID NO: 2).]
[Figure: partial nucleotide sequence with deduced amino acid sequence in which two introns are marked: intron 1 (60 bp) and intron 2 (62 bp).]